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Abstract 

The global burden of headache is very large, but knowledge of it is far from complete and needs still to be gathered. 
Published population-based studies have used variable methodology, which has influenced findings and made 
comparisons difficult. Among the initiatives of the Global Campaign against Headache to improve and standardize 
methods in use for cross-sectional studies, the most important is the production of consensus-based methodological 
guidelines. This report describes the development of detailed principles and recommendations. For this purpose we 
brought together an expert consensus group to include experience and competence in headache epidemiology and/or 
epidemiology in general and drawn from all six WHO world regions. The recommendations presented are for anyone, of 
whatever background, with interests in designing, performing, understanding or assessing studies that measure or 
describe the burden of headache in populations. While aimed principally at researchers whose main interests are in the 
field of headache, they should also be useful, at least in parts, to those who are expert in public health or epidemiology 
and wish to extend their interest into the field of headache disorders. Most of all, these recommendations seek to 
encourage collaborations between specialists in headache disorders and epidemiologists. The focus is on migraine, 
tension-type headache and medication-overuse headache, but they are not intended to be exclusive to these. The 
burdens arising from secondary headaches are, in the majority of cases, more correctly attributed to the underlying 
disorders. Nevertheless, the principles outlined here are relevant for epidemiological studies on secondary headaches, 
provided that adeguate definitions can be not only given but also applied in guestionnaires or other survey instruments. 
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Preface 

The importance of epidemiological studies for headache 
science is increasingly recognized. Apart from enhancing 
our understanding of disease origins patterns, aetiology and 
risk factors, thereby improving opportunities for treatment 
and prevention, these studies inform needs-assessment, 
underpin service policy and gain acceptance of headache 
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disorders as a public-health priority [1]. The methods, and 
quality, of published studies are variable [2,3]. The need for 
better and standardized methodology, supported by guide- 
lines for the design and performance of these studies, has 
been evident for some years [3,4]. 

These recommendations have been developed with this 
need in mind and are for anyone, of whatever background, 
with interests in designing, performing, understanding or 
assessing studies that measure or describe the burden of 
headache in populations. While aimed principally at re- 
searchers whose main interests are in the field of headache, 
they should also be useful, at least in parts, to those who 
are expert in public health or epidemiology and wish to 
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extend their interest into the field of headache disorders. 
Most of all, these recommendations seek to encourage col- 
laborations between specialists in headache disorders and 
epidemiologists. 

The development initiative came from Lifting The 
Burden (LTB) a UK-registered charitable nongovernmental 
organization, which directs the Global Campaign against 
Headache in official relations with the World Health 
Organization (WHO). An expert consensus group was 
convened by LJS and TJS with two considerations in 
mind: to include experience and competence in head- 
ache epidemiology and/or epidemiology in general, and 
to have international and cross-cultural relevance. To the 
latter end, members were drawn from all six WHO world 
regions (see Table 1). 

Publication of these recommendations follows a call 
for them [3] this supported by a review which identified 
multiple methodological shortcomings in published stud- 
ies, many with adverse impact on quality. Because these 
recommendations should stand alone, some of the argu- 
ment in this review is set out again here. 

Introduction 

Purpose 

The main aim of these recommendations is to improve 
the quality of studies of headache prevalence and burden. 
Much money and time are committed to such studies. We 
believe that attention to the principles set out here, and 
adherence to the recommendations, will ensure that these 
resources are better spent. We further believe these rec- 
ommendations will not only aid the planning and per- 
formance of new studies but also assist evaluation of 
previously published studies. 

A second aim is to make studies more comparable. 
Several attempts have been made to summarize the 



Table 1 Members of the expert consensus group 



African Region 


Gretchen L Birbeck (GLB) (Zambia) 




Redda Tekle-Haimanot (RTH) (Ethiopia) 


Region of the Americas 


Gretchen L Birbeck (GLB) (USA 




Luiz Paulo Queiroz (LPQ) (Brazil) 




Ann 1 Scher (AIS) (USA) 


Eastern Mediterranean Region 


Mohammed Al Jumah (MAJ) (Saudi Arabia) 


European Region 


Rigmor Jensen (RJ) (Denmark) 




Zaza Katsarava (ZK) (Georgia and Germany) 




Lars Jacob Stovner (US) (Norway) (chairman) 




Timothy J Steiner (TJS) (UK and Norway) 


South East Asian Region 


Gopalakrishna Gururaj (GG) (India) 


Western Pacific Region 


Shuu-Jiun Wang (SJW) (Taiwan) 


World Health Organisation 


Somnath Chatterji (SC) (Switzerland) 


World Health Organisation 


Tarun Dua (TD) (Switzerland) 



results of different studies [2,5]. In this effort, it has be- 
come evident that studies are performed and reported in 
quite different ways, making it very difficult to interpret 
and summarize their results and, particularly, to com- 
pare results from studies performed in different settings 
and countries or at different times. Also, while the qual- 
ity of studies is very variable, otherwise well-performed 
studies are sometimes reported in too little detail for the 
results to be useful in comparison with those of other 
studies. Hence, guidelines that will help in making stud- 
ies more uniform as well as of better quality have been 
called for [3,4]. 

Scope 

The focus of this document is on adult studies. Many of 
the recommendations and accompanying discussions ap- 
ply equally to studies of children and/or adolescents, but 
some recommendations will need adaptation. Some major 
issues - relating, for example, to sampling methods, diag- 
nosis, description and measurement of burden, and con- 
sent - are specific for children. Although a short section is 
included for these age groups, these recommendations 
may not be adequate for them in all of these respects. At a 
later date, a lengthier section devoted to these issues will 
be created in collaboration with the necessary expertise. 
There are also some considerations specific to studies of 
the elderly, and attention is drawn to these where relevant. 

In general these recommendations apply to prevalence 
of and burden generated by primary headache disorders, 
mainly migraine and tension-type headache (TTH). They 
are not intended to be exclusive to these, although bur- 
dens arising from secondary headaches are, in the major- 
ity of cases, more correctly attributed to the underlying 
disorders. An exception is medication-overuse headache 
(MOH), also secondary by definition but included be- 
cause it arises, on present understanding, only in the 
context of a pre-existing and usually primary headache 
disorder and may be considered a sequela of this. Un- 
questionably, MOH contributes to public ill-health [6], 
and these recommendations do embrace it. 

The recommendations also encompass other headaches 
occurring on >15 days/ mo nth, again because they cer- 
tainly contribute to public ill-health. It is acknowledged 
later that these may be poorly characterized and, within a 
survey conducted by enquiries at single points in time, im- 
possible to diagnose more specifically than as frequent 
headache. 

Specific recommendations have not been formulated for 
other secondary headaches, because our understanding of 
many of these disorders is incomplete and evolving, their 
causes and mechanisms are often unknown and their diag- 
nostic criteria - particularly in relation to causation - 
often too uncertain to be usable in epidemiological studies. 
Nevertheless, the principles oudined in this document are 
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relevant for epidemiological studies on secondary head- 
aches, provided that adequate definitions can be not only 
given but also applied in questionnaires or other survey 
instruments. 

The methodological issues in studies on headache preva- 
lence and burden arise mostly in three key areas. The first 
of these concerns the population of interest: how this is de- 
fined, and how the people who are part of it are sampled, 
reached, identified as cases or not and engaged in appro- 
priate enquiry. The second concerns diagnosis, where this 
is material to the enquiry, and the collection and reporting 
of data that adequately describe and quantify the different 
components of headache burden. The third concerns qual- 
ity control, data management, analysis and publication. 
Each of these is dealt with in these recommendations. 

Lastly it should be stated that this document is not a 
textbook on epidemiology. There are many issues of a 
general nature better explained in such textbooks. We 
recommend that studies on headache epidemiology and 
burden are planned and performed in collaboration with 
an epidemiologist familiar with the local population and 
culture(s). With regard to the reporting of studies, the 
STROBE (Strengthening the Reporting of Observational 
Studies in Epidemiology) statements (http://www.strobe- 
statement.org/) [7,8] should be consulted. Here in this 
document are recommendations that relate specifically to 
studies of headache, or address issues of particular rele- 
vance to headache, although some more general discus- 
sion is included where it is needed to support these. 

The burden of headache 

"Burden of disease" at a population level is the sum of the 
negative effects a disease has on the individuals within that 
population, together with any societal burden. It is a com- 
plex construct, composed of many different elements. Not 
all of the affected individuals actually have the disease. 

Most primary headaches are episodic, giving rise directly 
but intermittently to symptom burden. In a minority of 
cases, symptoms and symptom burden persist more or less 
continuously, albeit variably in their intensity. Symptoms 
include pain, which may be accompanied by nausea, vomit- 
ing and photo- and/or phonophobia. While present, these 
symptoms may cause debility and prostration, and reduce 
functional ability. This secondary disability burden is of 
magnified importance because headache is most common 
in people between their teens and 50-60 years of age - the 
productive years. Its consequences include inability to work 
or, when work is continued, reduced effectiveness and 
productivity. Attached to this may be an individual financial 
burden from lost pay. 

Given that headache attacks are unpleasant, it is hardly 
surprising that people who experience them frequently 
worry about when the next may occur. Neither is it surpris- 
ing that this can provoke anxiety, and avoidance behaviour. 



People with migraine in particular may identify triggers, 
which they attempt to eliminate by lifestyle compromise. 
Leisure activities may be cancelled or curtailed because of 
headache; when many have been cancelled, social events 
are likely not to be planned in the first place. Social life 
between attacks may simply cease. These are elements of 
interictal burden, which may continuously affect wellbeing 
and be sufficient to impair quality of life. 

A consequence of recurring inability to work may be a 
reputation for poor reliability, or inability to cope. De- 
creased probability of promotion follows, with failure to 
develop full career potential; in egregious cases, early re- 
tirement may be a forced result of persistent ill health. A 
consequence of lost school-time, affecting education, 
may be reduced career opportunities later. In both cases, 
the result is lower pay and impaired financial security. 
Over a lifetime, the cumulative burden of financial losses 
can be substantial. 

People with headache bear much of these burdens, but 
not all. Burden on others, unaffected by headache them- 
selves, arises in several ways. Failed or abdicated social 
roles during attacks affect partnership and parenthood. 
Family and friends lose the companionship they reason- 
ably expect, but which is not given by a person shut 
away in a dark room. Children may not always be looked 
after. Partners and other family members may inherit in- 
creased shares of chores and responsibilities. They may 
acquire a carer burden, called to look after the person 
with headache. Carers, as well as the sufferer, can lose 
time from work. 

Similarly, employers and work colleagues carry part of 
the burden of headache when paid work is not done. 
Either the employer pays for nothing, or colleagues must 
take on extra duties to make up. 

Consultations with physicians, and investigations, occupy 
time. Health-care resource consumption may be paid for 
by people with headache, but, when direct treatment costs 
for a condition affecting over 10% of the population are re- 
imbursed by a State-funded health system, the societal eco- 
nomic burden is substantial. Nevertheless, as a contributor 
to this element of burden, these costs are dwarfed by the 
indirect cost to national economies of absenteeism and re- 
duced effectiveness at work [6,9]. 

Headache epidemiology 

An accepted definition of epidemiology is "the study of 
distribution and determinants of disease frequency in hu- 
man populations" [10]. Epidemiological studies are often 
classified as descriptive (setting out the distribution of 
disease among different groups) or analytical (elucidating 
determinants, ie, risk factors or causes). 

The importance of headache disorders to public health, 
and therefore as a subject of epidemiological study, is very 
evident from the description above of headache burden. 
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Not only is this reflective of substantial public ill health 
but also, of course, it makes high demands for health care 
to relieve it. Descriptive epidemiological studies are the 
fundamental basis of needs-assessments for health-care 
provision, both for entire populations and for various sub- 
groups within it whose needs may be different. 

The usual focus of descriptive epidemiology in head- 
ache is prevalence: an estimate of how common a dis- 
ease is, expressed as the total number of cases in a 
population (numerator) divided by the number of individ- 
uals in that population (denominator). Other important 
and related concepts in descriptive epidemiology are inci- 
dence (a measure of the risk of newly developing a condi- 
tion within a specified period of time), which is expressed 
as a rate, remission (the probability of a case becoming 
a non-case, through natural history or intervention, again 
within a specified period of time), and duration (the period 
between onset and remission). Also related is mortality, but 
this has little relevance to primary headache disorders. 
Clearly, incidence, duration, remission and prevalence are 
tied together not only conceptually but also mathematically, 
the last being the steady-state consequence of the first two, 
often summarized by the formula P = ID, where P = (point) 
prevalence, I = incidence per year and D = duration in years. 

Headache terminology is unfortunately inconsistent, 
owing to the fact that headache is a chronic disorder with 
episodic manifestations. An "active headache disorder" is 
essentially characterized by the occurrence of symptoms 
within the previous year [11]. Prevalence studies that 
adopt this definition of a case (ie, an individual who re- 
ports at least one headache episode during that time) ne- 
cessarily use a timeframe of 1 year and usually report the 
findings as 1-year prevalence. Strictly speaking, however, 
this estimates the number of current cases (point preva- 
lence). A different enquiry may define a case only when 
symptoms are actually present ("headache now"); this also 
estimates point prevalence, but of headache attacks, not of 
a headache disorder. The terms "incidence" and "remis- 
sion" can, similarly, be applied either to headache attacks 
or to headache disorders, although in practice these terms 
have generally been used to refer to headache disorders, 
using specified time periods {eg, incidence rate = number 
with first onset of headache per 100 person-years; remis- 
sion rate = % of cases of headache disorder who then have 
no further attacks during 1 year of follow-up). All of these 
terms must be used carefully to avoid confusion. 

The large burden attributable to headache disorders 
should equally motivate analytical epidemiological stud- 
ies to determine risk factors and causes of headache, 
some of which may be preventable. This may not be 
possible within the same studies that measure prevalence 
and burden, requiring, in addition, the collection and 
registration of relevant exposure data and follow-up stud- 
ies. These requirements are often best fulfilled by larger 



epidemiological surveys unrestricted to a single disorder 
or set of disorders (see page 26). 

These recommendations are of equal relevance to de- 
scriptive and analytical studies. But which of these is the 
purpose of the study must always be clear at the outset: 
it affects, fundamentally, the design of the study, the choice 
of study population, the size of the study sample and the 
information to be collected. 

Working process 

Members of the expert consensus group were invited to 
participate in February, 2011. A review of the relevant 
literature had been performed already in connection with 
previous initiatives to document the prevalence of headache 
in Europe [5,12] and globally [2], the burden of headache 
[13], and methodological issues arising from their measure- 
ment [4]. A first draft of these recommendations was pre- 
pared by LJS and TJS and distributed to the members for 
comment in June, 2011, and both the draft and the com- 
ments received were debated in depth at a meeting of the 
group in Trondheim, Norway, during lst-3rd September, 

2011. The group drew also upon their recent experience 
from population-based studies conducted by Lifting The 
Burden [14-17]. A second draft emerged, and was circu- 
lated for further comments; a third draft, amended ac- 
cording to these, followed in early February, 2012; and a 
fourth draft, with comments assimilated by LJS and TJS, 
was approved by the group as a consultation draft in 
June, 2012. 

Approval from the International Headache Society to 
post this draft on their website for public worldwide con- 
sultation was requested in June and given in November, 

2012. The consultation took place throughout January and 
February, 2013, and appropriate comments and suggestions 
arising from it were incorporated by LJS and TJS in April, 

2013. The resultant fifth draft was sent to group members 
for approval and, with final amendments, became this doc- 
ument. The entire process took nearly three years. 

Summary of principles 

Ethical issues 

All studies should conform to the ethical principles of: 

• Respect for autonomy; 

• Non-maleficence; 

• Beneficence; 

• Justice. 

A study is unethical when: 

• It does not conform to norms and ethics of the 
country in which it is performed; 

• It is conducted without consent; 

• It utilizes resources wastefully. 
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A study may be unethical if: 



be replicated. 

The population of interest 

The population of interest should: 

• Be selected to match the purpose of the study, and 
the selection explicitly justified; 

• Be described in sufficient detail that the study can 
be compared with others. 

Control of bias 

• Biases are inevitable, but should be minimized. 

• The likely causes of bias should be identified at 
outset, and due steps taken to manage them. 

Sample selection and avoidance of selection bias 

• Sampling introduces multiple opportunities for 
selection bias, which should be recognized 
and controlled. 

• Sampling aims, first and foremost, to produce a 
surveyable group, within the population of interest, 
who are representative of it for all variables that may 
influence the objects of measurement. 

• The sample must be of sufficient size to achieve 
the study purpose(s), and not so large as to 
waste resources. 

• Sampling identifies individuals to be surveyed but 
usually does not of itself provide a means of access 
to these individuals, while the means of access is an 
important consideration to take into account when 
determining the sampling method. 

• The sampling method should be explicitly justified. 

• The sampling method should be described in 
sufficient detail that the study can be replicated. 

• Biases that might have arisen from the sampling 
procedure should be identified and discussed. 



Accessing and engaging participants 

Access is necessary to engage participants in the 
survey enquiry. 

In some cases, sampling of the population of 
interest and access to those identified as members 
of the sample are achieved simultaneously, in a 
single process. 

Incomplete access to all identified members of the 
sample introduces multiple opportunities for 
selection bias, which should be recognized 
and controlled. 

The access method should be explicitly justified. 
The access method should be described in sufficient 
detail that the study can be replicated. 
Access to potential participants does not of itself 
provide a means of engagement with them, but 
means of engagement is an important consideration 
to take into account when determining the means 
of access. 

• Incomplete engagement of accessed members of 
the sample introduces further opportunities for 
selection bias, which should be recognized 

and controlled. 

• The method of engagement should be 
explicitly justified. 

• The method of engagement should be described in 
sufficient detail that the study can be replicated. 

• Biases that might have arisen from incomplete 
access and/or engagement should be identified 
and discussed. 

Participation rate and non-participation 

• The participation rate must be calculated. 

• Low participation rate increases risk of selection bias. 

• Likely causes of non-participation should be recognized 
at the outset, and controlled. 

• A listing of the reasons for non-participation should 
be given, ideally in subgroups according to age 

and gender. 

• Biases that might have arisen from non-participation 
should be identified and discussed. 

Method of enquiry 

The method of enquiry should: 

• Be suited to the purpose of the study; 

• Encourage participation; 

• Capture the necessary data; 

• Take due account of available resources, the general 
conditions in the area(s) where the study will be 
performed and the characteristics of potential 
participants; 



• It does not respect issues of equity; • 

• It is under-resourced, so that its purpose cannot 
be achieved; • 

• It does not adhere to scientific principles. 

Study design 

The study design should: • 

• Match the purpose of the study; 

• Be explicitly justified; 

• Take due account of available resources and the • 
general conditions in the area(s) where the study • 
will be performed; 

• Be described in sufficient detail that the study can • 
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• Be tested in the specific setting of the study; 

• Be adequately described. 

Case definition and diagnosis 

• Headache caseness should be defined precisely, 
including its timeframe. 

• The screening question should be neutral, unless a 
good reason exists for it to be otherwise, and 
reported verbatim. 

• Diagnoses should be based on current ICHD 
criteria, but modifications are always necessary; 
these should be both explained and justified. 

• Diagnostic instruments should be applied in the 
language of intended participants. 

• Diagnostic instruments require validation in the 
language and specific setting of the study. 

• Diagnoses should be made algorithmically from the 
recorded responses to diagnostic questions, and not 
by the interviewer. 

• The value of diagnosing different headache types 
and subtypes should be critically assessed in the 
light of the study's purpose. 

Pilot study 

A pilot study: 

• Identifies problems before full commitment of 
resources, and therefore: 

• Supports optimum study design, logistic planning 
and effective conduct of the main study; 

• Avoids potentially major wastage of resources; 

• Is ethically desirable; 

• Helps in training of research staff, especially 
interviewers. 

Burden estimation 

Enquiries into burden should be: 

• Relevant; 

• Comprehensive (in relation to the purpose of 
the study); 

• Comprehensible to the respondent; 

• Amenable to analysis. 

Studies focused on economic burden of headache 

Comprehensive enquiries into economic burden: 

• Should include both direct and indirect costs; 

• Must have reliable data sources for attributing costs; 

• Must be based on population studies, not patient 
samples; 

• Should preferably be planned and performed in 
collaboration with a health economist. 



Data collection and storage 

Data management and storage processes should: 

• Be designed to minimize error; 

• Be carried out meticulously; 

• Respect the privacy of participants and accord with 
the laws and regulations for data storage in the 
country. 

Reporting the study 

• Poor reporting diminishes the value of otherwise 
good studies. 

• Reference to the STROBE statement is 
recommended. 

• Methods should be reported in sufficient detail to 
permit replication (which may require that they be 
published separately). 

• Results should be reported in accordance with the 
study's purpose(s). 

• Discussion should show that the study objective was 
achieved, or explain why it was not. 

Ethical issues 

Ethical issues in epidemiological studies arise in planning, 
conducting and reporting them. 

Principles 

All studies should conform to the ethical principles of: 

• Respect for autonomy; 

• Non-maleficence; 

• Beneficence; 

• Justice. 

A study is unethical when: 

• It does not conform to norms and ethics of the 
country in which it is performed; 

• It is conducted without consent; 

• It utilizes resources wastefully. 

A study may be unethical if: 

• It does not respect issues of equity; 

• It is under-resourced, so that its purpose cannot be 
achieved; 

• It does not adhere to scientific principles. 
Commentary 

The expert consensus group recognized the several eth- 
ical principles firmly and universally established in med- 
ical practice. These include respect for the autonomy of 
patients, non-maleficence, beneficence and justice [18], 
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the last with particular reference to resource allocation 
in a context of limited resources (distributive justice). 
Medical professional ethical principles include veracity 
(truth-telling), fidelity (the keeping of promises) and con- 
fidentiality, and there are, in addition, more general ap- 
proaches to medical ethics based, for example, on human 
rights, the needs of patients, the responsibilities of doc- 
tors, the good of society as a whole, and deserts. These are 
discussed in detail elsewhere, specifically in relation to 
sponsored and non-sponsored clinical research into head- 
ache [19,20]. 

Approval of a study by an appropriate ethics review 
board (ERB) is mandatory. Usually this should be ob- 
tained from a local ERB. Where such does not exist, ap- 
proval is required from another legitimate, authoritative 
and competent source, such as the WHO Research Ethics 
Review Committee. Data protection (see page 7) must also 
be given due consideration [20], and may, according to 
the laws of the country in which the study is conducted, 
require additional approvals. 

Consent in surveys is often implicit, and continuing: 
respondents renew consent each time they provide an 
answer to an enquiry. Formal written consent at the start 
of the enquiry provides only evidence of consent at that 
moment, and may serve litde purpose for that reason (al- 
though an ERB may require it). Consent must be informed, 
which requires that the purpose and nature of the survey 
are explained to the participants' satisfaction. Consent 
must be also voluntary, and participants allowed to with- 
draw from the study whenever they may wish. 

Inducements to take part in a study that carries no 
risk of harm to the participant do not directly raise con- 
cerns: both parties benefit - the researcher from the 
subject's participation and the subject from the induce- 
ment. However, they may raise concerns indirectly, since 
inducements are not of equal value to all potential sub- 
jects. Monetary inducements are more attractive to poorer 
people, and this does not respect the principle of equity. 
For the same reason, monetary inducements (and prob- 
ably inducements of any sort) are likely to increase partici- 
pation bias (see page 9). 

When the inducement is the offer of needed health 
care, either free or to which the research subject would 
not otherwise have ready access, he or she may have lit- 
tle choice but to participate. Still the participant benefits, 
and, in the absence of risk of harm, even this may be 
considered acceptable. On the contrary, it may be ar- 
gued that collecting data from needy people while offer- 
ing them nothing in return is objectionable. Justification 
is forthcoming when the purpose of the survey is needs- 
assessment - to inform the development of health ser- 
vices, which, later, will be provided for the benefit of the 
population being surveyed. But there is an important 
question, which goes beyond the ethics of reward: to what 



extent is there a duty of care upon researchers when un- 
treated and possibly serious illness is discovered by a sur- 
vey? This question is sometimes raised - especially in 
low-income countries. These recommendations cannot 
give a general answer: this must be a matter for local ERBs. 
Two points can be noted: first, surveys are commonly 
made by lay interviewers, who do not themselves diagnose 
and have no skills to recognize illness, let alone do any- 
thing to alleviate it; second, research that may benefit a so- 
ciety cannot be made too onerous, or it will never be done. 

Data protection, and consent relating to the use of 
personal data, generally require that participants are ex- 
plicitly informed of each of the following: 

• Where, in what form, how and by whom data 
relating to them will be held; 

• Who will have access to them; 

• The purpose(s) for which they will be used, with 
guarantees that they will be used for no other 
purpose (this implies that, if data are to be stored 
long-term for other purposes not yet foreseen, at 
least a general explanation of this intention should 
be given); 

• How they will be destroyed once the purposes are 
achieved. 

To reduce the possibility of misuse of personal data, 
the duration of data storage should be as short as pos- 
sible. On the other hand, it is desirable, and regulators 
often require it, that original research data be stored for 
several years, for documentation and to enable detection 
of fraud in science. 

Resources are limited 

Studies that waste them (whether financial resources or 
the willingness of subjects to take part) are unethical be- 
cause of the opportunity cost: other studies will not be 
possible as a result. Under-resourced studies that cannot 
achieve their purpose are likely to be unethical because 
the resources are probably wasted. 

Adherence to these recommendations should ensure 
appropriate allocation of resources to, and their effective 
use in, headache burden research. However, the need for 
efficiency in this context calls for careful deliberation about 
whether a particular new headache epidemiological study is 
required at all, and about the need for high diagnostic pre- 
cision, large sample size and resource-demanding methods 
of data collection. In some circumstances, a new stand- 
alone study can be adequately replaced, with conservation 
of resources, by joining a larger health survey. Disadvan- 
tages of the latter are discussed later (see page 26). 

Unscientific studies waste resources, and are likely to 
be unethical for this reason alone. Worse, they may gen- 
erate misleading results. 



Stovner et al. The Journal of Headache and Pain 2014, 15:5 
http://www.thejournalofheadacheandpain.eom/content/15/1/5 



Page 8 of 30 



Methodological issues 

Important methodological issues arise in each of the 
following: 

• The design of the study; 

• Identification of the population of interest; 

• Control of bias; 

• Sampling from the population of interest; 

• The sample size; 

• Means of access to and engagement with 
participants within the sample; 

• Participation rate; 

• Method of enquiry and survey instruments 
(questionnaires); 

• Case definition and diagnosis; 

• Timeframe; 

• Burden estimation (which aspects, and how?); 

• Data management. 

Study design 
Principles 

The study design should: 

• Match the purpose of the study; 

• Be explicitly justified; 

• Take due account of available resources and the 
general conditions in the area(s) where the study 
will be performed; 

• Be described in sufficient detail that the study can 
be replicated. 

Commentary 

Most studies on headache prevalence and burden have a 
cross-sectional design, since their aims are descriptive. 
In such studies, prevalence and burden are assessed at 
the same point in time. Some headache epidemiological 
studies have more analytical aims: to define causes of or 
risk factors for headache. These more usually have a 
case-control or a cohort design. Case-control studies 
typically compare cases (those who have a disease) with 
controls (otherwise similar persons who do not have the 
disease) for prior exposure to one or more putative risk 
factors. In cohort studies, a group (cohort) of disease- 
free individuals are followed and assessed periodically to 
determine whether they have developed the disease of 
interest. Study subjects within the cohort are categorized 
according to whether or not they have been exposed to 
a suspected causal factor, and incidence rates are com- 
pared in the exposed and unexposed categories. 

The different methodologies demanded by these ap- 
proaches are explained in standard textbooks [21]. They are 
different mostly in how the study samples are selected; the 
principles for collecting data, engaging with participants 
and diagnosing headaches are similar. Therefore, while the 



present recommendations mostly concern cross-sectional 
studies, we believe they will be useful in all study designs. 

The population of interest 

(sometimes referred to as the sampling frame; other 
terms better avoided, which may have similar meaning 
but are sometimes used differently or ambiguously, are 
"source population", "target population" and "reference 
population"). 

Principles 

The population of interest should: 

• Be selected to match the purpose of the study, and 
the selection explicitly justified; 

• Be described in sufficient detail that the study can 
be compared with others. 

Commentary 

The population of interest is the population that it is 
wished to study, and includes every person so defined. It 
is invariably defined geographically, and often also by 
one or more additional characteristics. 

In headache research, the population of interest is usu- 
ally, but not always, the population of a whole country. 
It may also be of a region larger or smaller than a country. 
Sub-populations defined by additional characteristics may 
nonetheless be perfectly legitimate subjects of study: de- 
pending on the aim of the study, the population of interest 
may be restricted to a specific age group (eg, adults of 
working age, adolescents, school or pre-school children), to 
members of groups defined by ethnicity, culture or lan- 
guage, to workers in certain trades or professions or univer- 
sity students, or to people with another particular disease, 
etc. These recommendations remain relevant to studies of 
these more selected groups. 

While they are easily accessed, and for this reason 
often studied, headache patient populations are rarely le- 
gitimate populations of interest, not just because they 
are highly selected but also because the criteria by which 
they are selected (often self-selected) are generally inde- 
terminable. Thus, a study of such populations tells little 
about, and cannot be extrapolated to, either the general 
population or any more broadly-defined population. An 
arguable exception occurs in the case of severe headache 
disorders (see page 25). 

Characteristics of the population of interest 

These (distributions of age, gender, educational levels and 
socioeconomic status, proportions living in rural and urban 
areas, etc.) cannot always be known, but it is advantageous 
if they are. Such knowledge allows evaluation of repre- 
sentativeness of the participating sample, and statistical 
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adjustments with regard to these features when neces- 
sary (see below and page 15). 

Control of bias 
Principles 

• Biases are inevitable, but should be minimized. 

• The likely causes of bias should be identified at 
outset, and due steps taken to manage them. 

Commentary 

Bias refers to systematic rather than random error. It is 
somewhat unavoidable, but not necessarily of consequence: 
this depends on its magnitude and on whether it is differen- 
tial (affecting some participants more than others) or non- 
differential (affecting all participants equally). The potential 
for bias, and the need to avoid or at least minimize it, drive 
many aspects of the planning, execution and analysis of 
a study. 

Biases in epidemiological studies are of two main types: 
selection bias and information (or measurement) bias. 

Selection bias, or a systematic error in taking the sample, 
is commonly the result of an imperfect sampling procedure 
and/or a low participation rate. The result may be a sample 
that is unrepresentative of the population of interest. With 
regard to gender and age composition, statistical adjust- 
ments can be made if these properties are known for the 
population of interest (see above and page 15 and 28). 
However, people suffering from headache are more likely 
to participate in headache studies because they have more 
personal interest in them. This can give rise to a form of 
selection bias usually termed participation bias or interest- 
related bias. 

Sometimes the extent of selection bias can be esti- 
mated in a study among non-participants (see page 15). 

Information bias, or a systematic error in measuring 
disease or exposure status, may be introduced when the 
way in which information is gathered varies systematically: 
for example, when two interviewers, one in a city and the 
other in a rural area, do not perform the interview in the 
same way, consequent differences in results may be erro- 
neously attributed to area of habitation. Similarly, if the 
same interviewer does all interviews in one location first, 
and then all those in another, systematic differences in the 
manner of interviewing over time {eg, due to a learning 
curve) may again yield spuriously different results in the 
two locations. 

Sample selection and avoidance of selection bias 
Principles 

• Sampling introduces multiple opportunities for 
selection bias, which should be recognized and 
controlled. 



• Sampling aims, first and foremost, to produce a 
surveyable group, within the population of interest, 
who are representative of it for all variables that may 
influence the objects of measurement. 

• The sample must be of sufficient size to achieve the 
study purpose(s), and not so large as to waste 
resources. 

• Sampling identifies individuals to be surveyed but 
usually does not of itself provide a means of access 
to these individuals, while the means of access is an 
important consideration to take into account when 
determining the sampling method. 

• The sampling method should be explicitly justified. 

• The sampling method should be described in 
sufficient detail that the study can be replicated. 

• Biases that might have arisen from the sampling 
procedure should be identified and discussed. 

Commentary 

In an ideal study, everyone in the population of interest 
would be included. Occasionally this has been done [22], 
but in most cases it is not a realistic possibility. Instead 
it is necessary to choose from the population of interest a 
smaller, manageable group of people (the sample) to whom 
access is possible. The essential requirement of this sample 
is that it should remain representative of the population of 
interest, because the intention is to generalize the data from 
it to the whole population of interest. "Representative" 
means similar to the population of interest in all properties 
of relevance to (ie, likely to influence) the objects of meas- 
urement (here, headache prevalence and/or burden). There 
is an assumption here that knowledge exists of what 
these properties are, which may not be entirely true. In 
the context of headache, representativeness clearly en- 
compasses age and gender, which are known to affect 
headache prevalence, probably should encompass so- 
cioeconomic class, employment status, area of habita- 
tion (rural or urban) and ethnicity, and possibly, in 
some settings, should also encompass native language 
and/or tribal group. Methods that ensure or at least 
optimize representativeness with regard to a range of 
identified variables such as these are likely, although 
not certain, to achieve the same with other, unrecog- 
nized variables. 

How to obtain a sample representative of the popula- 
tion of interest is a general issue for all epidemiological 
studies, which is dealt with in standard textbooks [21]. 
The main principles and common sampling strategies 
are considered here in the context of headache. 

Probability sampling 

In probability sampling methods, each member of the 
population has an initial probability (which is larger 
than zero) of being selected, and this probability can be 
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accurately determined. These methods include random 
sampling, systematic sampling (every n th person in a 
random list), stratified sampling and (multistage) clus- 
ter sampling (see page 10). All depend on random se- 
lection, and allow calculations of sampling error and 
valid inferences from the sample about the population 
from which it was drawn. The penalty is that, in general, 
they are more resource-consuming than non-probability 
sampling methods. 

In simple random sampling, all individuals within the 
population of interest have an equal probability of being 
selected. The method is vulnerable to sampling error: by 
chance, related inversely to sample size, important char- 
acteristics of the sample such as gender or age distribu- 
tions may not well reflect those of the whole population. 
Stratified sampling reduces this chance by dividing the 
whole population into sub-populations (strata), different 
with regard for example to age, gender and/or habita- 
tion, and randomly drawing the sample from within each 
of these strata, in parts in proportion to their size. 

Both random and stratified sampling are relatively easy 
when an overview of the population of interest exists. 
An overview is usually in the form of a register of all 
members of it. Selection can then be made directly from 
the register (usually by computer). In some countries, a 
census, repeated periodically, lists all inhabitants (except 
illegal immigrants, who may or may not be of interest 
within the study). Other registers that encompass most 
of the population of a country or area may be electoral 
rolls, telephone directories, registers of health insurance 
companies and lists drawn up by commercial market- 
survey or polling companies. When the workforce of a 
company is the population of interest, the list of em- 
ployees is such a register. For stratified sampling, a useful 
register must include not only identities but also, for each 
listed person, the characteristics upon which stratification 
is to be based (eg, age, gender, habitation). 

An overview of the population need not take the form 
of a list of names: a map showing all households in an 
area to be sampled can serve the same purpose. Some- 
times a new map of residences must be made, or old 
maps of an area updated, before a valid selection of resi- 
dences is possible. 

Sampling by telephone is an established method [23]. 
Where telephones (landline or mobile) are widespread, 
but no population overview exists in the form of a complete 
telephone directory, dialling area code(s) followed randomly 
by as many digits as are typical for phone numbers in the 
area(s) (random digit-dialling) is an effective method of 
obtaining a random sample [23]. In many settings, however, 
this method risks bias because telephones are not evenly 
distributed among different age, gender and socioeconomic 
groups. It is important for the calculation of participation 
rate (see page 14) to register which numbers dialled turn 



out to be real telephone numbers and which are non- 
existent. 

Cluster sampling is an alternative to these methods. It 
is useful in areas with no pre-existing overview of the 
population, but often preferable anyway because it is lo- 
gistically efficient, reducing travel costs and time for the 
interviewer when access methodology requires that par- 
ticipants are visited. Usually, cluster sampling involves 
selecting participants only from a limited number of de- 
fined geographical areas {eg, blocks, streets or parts of 
villages, or perhaps schools) that are themselves chosen 
randomly. Areas can also be stratified according to socio- 
economic status, urban/rural location, etc. In multistage 
cluster sampling, smaller areas are selected randomly within 
larger areas, and this is repeated in many stages until the 
requisite number of small, surveyable areas are identified, 
spread around the region or country. In these ultimate 
units, all individuals, or a random selection of individuals or 
households, are contacted. 

Cluster sampling often makes use of maps of residen- 
tial areas in order to select the sample. 

Contacting households rather than individuals 

A household contains a group of people, often but not al- 
ways a family, living together (defined as sharing a com- 
mon kitchen). Special issues arise when sampling employs 
and is dependent upon methods of access (see page 12) 
that lead to contacts initially with households rather than 
single persons (eg, calling door-to-door or by household 
telephone without prior warning). Generally, in headache 
studies, only one of a family should be selected, because 
members of families are similar genetically, share their en- 
vironment and have common lifestyles, effectively redu- 
cing variance when two or more are included. (Of course, 
this is not the case when the intention is to study familial 
occurrence). Selection bias can arise very easily, because 
certain types of person within a household are more likely 
to stay at home, open the door or answer the phone. This 
may vary according to time of day. Consequently, rather 
than select whoever happens to answer, the interviewer 
should list all members of the household, determine who 
among them are eligible and then select the participant 
randomly from those that are. 

Replacements 

The study protocol should set out the method for select- 
ing replacements when it is impossible to contact a se- 
lected person. This can be achieved by pre-selecting more 
individuals than needed within each stratum, or more resi- 
dences within each area, allowing for a defined non- 
contact rate based on expectation (perhaps informed by a 
pilot study [see page 21]). Otherwise it can be achieved by 
extending the sampling process (for example, by visiting 
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more randomly-selected households than were initially 
specified). 

As a prior requirement, the protocol should define "im- 
possible to contact" in a way that avoids bias. This is dealt 
with below, under "Accessing and engaging participants". 

Non-probability sampling 

With non-probability sampling methods, participants are 
selected from the population in some non-random man- 
ner, so that some members have reduced or no chance of 
being selected while others have enhanced chances, or the 
probability of selection cannot be determined. Methods 
include convenience sampling (selecting the part of the 
population that is conveniendy to hand - patient samples 
being an example), judgment sampling (selecting those 
who are judged most likely to provide the information of 
interest), quota sampling (selecting quotas of the popula- 
tion fulfilling particular traits) and snowball sampling (al- 
lowing selected participants to recruit future participants 
among their acquaintances). 

In non-probability sampling, selection bias is always 
there; the degree to which the sample differs from the 
population of interest is uncontrolled and may be 
large. It may be knowable, by comparing the sample 
for key characteristics (age, gender, etc.) with the whole 
population using published statistics, but, generally, se- 
lection bias is impossible to assess. Hence, if non- 
probability sampling is the only available option, serious 
consideration should be given to whether the study is 
worthwhile. 

Sample size determination 

Sample size depends on the prevalence of the disorder 
and the precision of the estimates needed, the latter be- 
ing related to the purpose of the study. Additionally and 
importantly, sample size raises issues of resources and of 
ethics (see page 5). A larger sample clearly requires and 
consumes more resources. It is an unethical waste of re- 
sources to use unnecessarily large samples. On the other 
hand, it is futile, and therefore unethical, to use samples 
too small to support the purpose of the study. 

In almost all cases, the required sample size is independ- 
ent of the size of the population. An exception occurs when 
the population is very small, so that the sample is a large 
proportion of it (eg, the few hundred employees of a com- 
pany). This is because the probability for any individual of 
being selected changes significandy as the sampling process 
continues. 

As to precision, Table 2 shows estimates of the mar- 
gin of error for sample sizes ranging from 10 to 10,000. 
The margin of error is dependent on the prevalence of 
the condition under scrutiny. With decreasing preva- 
lence the absolute margin of error decreases, but the 
relative margin of error (ie, margin of error/prevalence) 



Table 2 Margin of error* (95% confidence interval) 
according to sample size and prevalence of the condition 



Sample 
size (n) 




Prevalence of condition 




50% 


30% 


10% 


3% 


1% 


10 


31.6 


29.0 


19.0 


10.8 


6.3 


20 


22.4 


20.5 


13.4 


7.6 


4.4 


50 


14.1 


13.0 


8.5 


4.8 


2.8 


100 


10.0 


9.2 


6.0 


3.4 


2.0 


200 


/.I 


6.5 


4.2 


2.4 


1.4 


500 


4.4 


4.1 


2.7 


1.5 


0.9 


1,000 


3.2 


2.9 


1.9 


1.1 


0.6 


2,000 


2.2 


2.0 


1.3 


0.8 


0.4 


5,000 


1.4 


1.3 


0.8 


0.5 


0.3 


1 0,000 


1.0 


0.9 


0.6 


0.3 


0.2 



•calculated as 2 standard deviations (SDs) of the prevalence where 
SD = V(prevalence*[1-prevalence]/n). 



increases. For a condition with a prevalence of 50% it 
appears that, by increasing the sample size from 100 to 
1,000, the margin of error is decreased from 10% to 
3.2% (absolute). Thus, if the sample size is 1,000, and 
50% report headache, there is 95% probability that be- 
tween 46.8 and 53.2% of the total population have 
headache. This is acceptable precision for many pur- 
poses, while the gain from sample sizes of >2,000 is 
small and often not worthwhile. 

However, if the aim is to investigate a relatively rare con- 
dition with an expected prevalence of 1%, an absolute mar- 
gin of error of 0.6% may be very unsatisfactory, in which 
case the sample size must be increased considerably. The 
same is true when estimates within or comparisons be- 
tween subgroups (eg, men vs women, rural vs urban) are 
part of the purpose: the sample size must be calculated to 
include sufficient of the population in the smallest sub- 
group. To avoid inflating the overall sample unnecessarily, 
it is possible to "oversample" persons of that particular sub- 
group (ie, select more than the proportion in the popula- 
tion). Persons of this particular subgroup then have a 
higher (but known) chance of being selected than those in 
other subgroups that were not oversampled, and correction 
is necessary when calculating overall prevalence. 

A larger sample may also be needed to estimate burden 
than to estimate prevalence, because burden is not distrib- 
uted equally among cases: most of it is accounted for by a 
minority of those with the disorder. Thus, 3-4% of the popu- 
lation have most of the burden of migraine [24]; among all 
people with migraine, TTH or MOH, the relatively few with 
MOH have the highest individual burden [6]. 

Cluster sampling is assumed to reduce natural vari- 
ance, and therefore requires larger sample sizes to obtain 
the same precision of estimates [25]. 
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Accessing and engaging participants 
Principles 

• Access is necessary to engage participants in the 
survey enquiry. 

• In some cases, sampling of the population of interest 
and access to those identified as members of the 
sample are achieved simultaneously, in a single 
process. 

• Incomplete access to all identified members of the 
sample introduces multiple opportunities for 
selection bias, which should be recognized and 
controlled. 

• The access method should be explicitly justified. 

• The access method should be described in sufficient 
detail that the study can be replicated. 

• Access to potential participants does not of itself 
provide a means of engagement with them, but 
means of engagement is an important consideration 
to take into account when determining the means 
of access. 

• Incomplete engagement of accessed members of the 
sample introduces further opportunities for selection 
bias, which should be recognized and controlled. 

• The method of engagement should be explicitly 
justified. 

• The method of engagement should be described in 
sufficient detail that the study can be replicated. 

• Biases that might have arisen from incomplete access 
and/or engagement should be identified and discussed. 

Commentary 

Once identified, members of the sample must be contacted 
(access), and their willing commitment to the enquiry pro- 
cured (engagement). Upon the latter depends how care- 
fully and completely they will respond and, therefore, data 
quality. In many cases, procedures to identify and access 
participants are the same, or closely bound together. How- 
ever, sampling from registers does not assist access to 
those subjects. 

Access 

Means of access clearly depends on what means of com- 
munication exist in the area, on availability of telephone 
registries, e-mail or home address lists and/or up-to-date 
maps of residential areas, and on infrastructure (eg, means 
and ease of travel). Well-tried and potentially effective 
methods include visiting door-to-door and calling by tele- 
phone (landline or mobile), both usually done without 
prior warning (cold-calling). Alternatively, letters may be 
sent by mail or e-mail when participants have been se- 
lected from registers that provide addresses. In such cases, 
in some settings, participants can be invited to come to 
the office of the interviewer. All of these are compatible 



with probability sampling. Other methods involve non- 
probability sampling: for example, stopping prospective 
participants in the street. This results in a convenience 
sample, as does using lists of telephone numbers or e-mail 
addresses that happen to be available rather than complete 
registries. 

Cold-calling at households tends to yield a higher par- 
ticipation rate (among people who are at home) than tele- 
phone interviews, which are easier for the interviewee to 
terminate. In both cases, selection bias can arise easily, be- 
cause certain types of person within a household are more 
likely to stay at home, open the door or answer the phone. 
Avoidance of this bias is discussed above (see page 9). 

Accessing participants by mail is cheap, and by e-mail 
even cheaper, but these methods have two major draw- 
backs. First, they presume the use of self-administered 
questionnaires (see page 13). Second, selection bias is 
unavoidable because certain types of people are inher- 
ently less likely to reply. 

Inviting prospective participants to the office of the 
interviewer is highly problematic. In favour of this ap- 
proach is that participants not only are seen face-to-face 
but also can be examined when necessary. This may be 
the only practical access method when detailed interro- 
gation, examination or investigation is required. In other 
circumstances it is not ideal: while convenient for the 
interviewer, it is time-consuming for the participant, which 
risks both lowering the participation rate and introducing 
bias with regard to who are willing and have the time. 

Repeated attempts at access 

As a means of increasing participation rate, and perhaps 
reducing participation bias, multiple attempts may be 
necessary to contact persons who do not answer first 
time: by sending one or more reminder letters, or by re- 
peating telephone calls or house visits. The study protocol 
should define not only how many (commonly three) but 
also when additional contact attempts are made before a 
person or a household is deemed impossible to contact. 

Repeated contacts have limited effect on selection bias 
in responses to letters or e-mails. But selection bias in 
cold-calling can and should be controlled, especially bias 
potentially arising because certain types of household are 
more likely to be empty, or their phones unanswered, at 
particular times of the day (eg, working households will be 
selectively uncontactable during normal working hours; 
older people may not open doors to strangers in the even- 
ing). The protocol should adopt a systematic approach to 
this, stipulating a schedule for repeat contacts that makes 
provision for these and other factors that are likely. 

Engagement 

Engagement entails the procurement of willing cooper- 
ation, so that the enquiry, which usually means obtaining 
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responses to a questionnaire, can be completed. Success 
or failure in engagement directly affects participation rate 
and, therefore, participation bias (see page 9). Induce- 
ments to participate (see page 7) may increase both par- 
ticipation rate and participation bias. 

While the method of engagement depends upon the 
means of access, it is also necessarily determined by cer- 
tain characteristics of the prospective participants: in par- 
ticular, literacy, language, culture and cognitive ability. 

Administration of a questionnaire or other survey in- 
strument can be achieved by interview, which may be 
face-to-face or by telephone. Self-administered question- 
naires can be mailed or e-mailed, handed out in certain 
settings (school, workplace, doctor's office, etc.) or distrib- 
uted by internet. Sometimes, subjects may be engaged in 
groups or through a third person. 

Face-to-face interviews are the most direct method of 
engagement, and, uniquely, allow participants to be ex- 
amined. They are almost the only method useful in pop- 
ulations with poor literacy. Their big drawback is that 
they are time-consuming and therefore expensive. Tele- 
phone interviews are almost as direct but, of course, 
physical examination cannot be performed. Both face-to- 
face and telephone interviews allow clarification of ques- 
tions. This is generally thought to lead to more accurate 
answers, but clarification can give rise to information 
bias, firstly because the information given to those who 
ask is different from that given to those who do not, and 
secondly because different interviewers may give different 
clarifications. Therefore, if clarification is allowed, there 
should be clear, pre-specified limits to the extent of it, 
aiming to ensure that all interviewers do it in the same 
way. The pilot study (see page 21) is a means of anticipat- 
ing these potential difficulties, identifying which questions 
may cause trouble and the nature and extent of clarifica- 
tions they may require. 

In a computer-assisted telephone interview (CATI), lay 
interviewers follow a script driven by a computer pro- 
gram; the responses are entered directly into the computer 
and in-built branching logic will skip questions that are 
not relevant according to previous answers. Face-to-face 
interviews can be computer-assisted in the same way. 
Otherwise, similar methodology can be incorporated into 
face-to-face interviews by careful questionnaire design. 
These methods permit only pre-scripted clarifications. 

Self-administered questionnaires are a relatively cheap 
method, but one that requires a high degree of literacy 
and some familiarity with answering questionnaires. The 
lack of engagement with an interviewer provides no en- 
couragement to respond and no opportunity for clarifica- 
tions; participation rates are generally low, and incomplete 
returns are common. 

Engagement in groups can be a cost-effective way of 
performing a study in some settings {eg, by a teacher 



posing questions about or descriptions of headaches 
collectively to the pupils in a classroom). Engagement 
through a third person may be the only way to gain contact 
with or information from some participants. Among small 
children, most or part of the information must be obtained 
through parents, or from teachers in the case of school 
children. In some cultures, people affected by headache 
can be engaged only through village elders or heads of 
households. All such methods risk misunderstandings, and 
this risk must be evaluated. Additionally, lower sensitivity 
and specificity for detecting headache must be expected 
from such remote engagements than from making direct 
contact with everyone in the sample individually. 

Interviewers 

Careful selection and adequate training of interviewers 
are of paramount importance, whether interviews are 
conducted face-to-face or by telephone. 

As to selection of interviewers, interviewing is a skill in 
itself, and it should not be assumed that health personnel 
such as nurses, medical students or, especially, doctors are 
the best qualified to do it. Unless the interviewer is a 
headache specialist (defined here as a physician skilled in 
headache diagnosis and familiar with the culture and lan- 
guage of the respondent), diagnoses should be made not 
by the interviewer but by later applying an algorithm to 
the questionnaire responses (see page 20). It is therefore 
doubtful, in most surveys, whether clinical skills are im- 
portant: it may be better to engage professionally-trained 
interviewers who understand interview methodology, are 
given basic understanding of the purposes of key ques- 
tions and who follow the questionnaire and operations 
manual. 

On the other hand, if the interviewer is a headache spe- 
cialist, multiple diagnoses can be made in the same patient, 
where appropriate, and, if examination and supplementary 
investigations are made, secondary headaches can be diag- 
nosed. No validation of the diagnostic method is needed 
(see page 20) since it can be assumed that optimal diagnos- 
tic methods are employed. Headache specialists, of course, 
may not explicidy use ICHD diagnostic criteria: they have 
at their disposal and are likely to apply, as in the clinic, a 
broader, experienced-based and more inclusive set of cri- 
teria for diagnosis, which ICHD-based questionnaires can 
never match. High sensitivity can be expected for detecting 
relatively minor headache complaints and rare headaches. 

In most low-income countries, headache specialists 
are simply not available. 

Adequate training embraces a clear understanding of 
the nature and purpose of the survey, and a recognition of 
which questions are of particular importance or may need 
clarification (see page 13). If several interviewers are used, 
training should be identical for each of them to ensure 
that data are collected in the same way by all, without 
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introduction of information bias (see page 9). While mul- 
tiple interviewers almost inevitably cause some degree of 
variability, more interviewers reduce the duration of the 
study, which, if long, can also be a source of variation. 

Participation rate and non-participation 

(avoided terms: response rate, responder rate). 



criteria. A flow diagram is an effective way to demonstrate 
how the participating sample was made (see Figure 2). 

In almost all cases, non-participation results principally 
from refusal. Additionally, some of those who consent to 
participate may nevertheless be unable to cooperate use- 
fully, perhaps answering only a few questions or providing 
conflicting responses. 



Principles 

• The participation rate must be calculated. 

• Low participation rate increases risk of selection bias. 

• Likely causes of non- participation should be recognized 
at the outset, and controlled. 

• A listing of the reasons for non-participation should 
be given, ideally in subgroups according to age 

and gender. 

• Biases that might have arisen from non-participation 
should be identified and discussed. 

Commentary 

Although no universal definition of it exists [8], partici- 
pation rate (strictly a proportion rather than a rate) is 
generally understood as the proportion of those selected, 
contacted and eligible who actually participate meaning- 
fully in the study. In Figure 1, it is given by E/C. As this 
figure shows, calculation of the participation rate ex- 
cludes those in the preselected sample who were not 
contactable because they had died or moved away since 
the study was planned, or because no-one answered the 
phone or opened the door. It also excludes those who 
were contacted but found to be ineligible because it was 
not possible to ask them for consent, they did not fulfil the 
eligibility criteria (eg, wrong age or gender) or it could not 
be determined whether or not they fulfilled the eligibility 



Bias through non-participation 

Only a very high participation rate guarantees representa- 
tiveness of the participating sample. This is rarely achieved, 
except among "captive" samples such as pupils interviewed 
during school hours. Participation rates of >80% and >70% 
are considered excellent and acceptable respectively, but 
even these do not automatically secure representativeness. 
On the other hand, a low participation rate does not ne- 
cessarily mean that the participating sample is unrepre- 
sentative: this depends on the factors responsible for 
non-participation [8]. Because participation rate tends 
to vary between different subgroups of the eligible sam- 
ple (eg, it may be particularly low among young males), 
it influences the overall results (ie, is a source of bias). 
In other words, both how many and who actually par- 
ticipate in the study are crucial to representativeness of 
the participating sample. 

For these reasons, participation rate needs to be known 
overall (E/C in Figure 1), which is relatively easily achieved. 
Ideally it should be known also in relevant subgroups. 
This requires collection of a minimal dataset from non- 
participants (see below), including the characteristics most 
likely to influence the object of measurement (age, gender 
and socioeconomic status, when this object is headache). 
This may not be possible. In default, comparisons for 
these characteristics between the participating sample and 
the population of interest can be made, if they are known 



A: Preselected 
sample 

B: Contactable 
sample 





C: Eligible sam 




Figure 1 Definition of participating sample. 



Dead, moved away or not 
answering, or not possible to ask 
about eligibility or consent 



Not fulfilling criteria for 
participation 



Not consenting to participate 



Not completing interview 
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Not accessed, or not 
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to participate, n = 



Consenting sample 
n = 



Not completing interview, or 
missing or conflicting data, 
n = 



Participating sample, 
n = 

Figure 2 Flowchart of participation. 



for the latter. Then, within limits, imbalances can be cor- 
rected statistically by weighted adjustments of the raw 
data (see pages 9 and 28). 

Furthermore, all reasons for non-participation (eg, de- 
clined to participate, too sick to be interviewed, did not re- 
turn questionnaire, questionnaire not filled in appropriately, 
etc.) should be listed, and taken into account in evaluating 
representativeness of the participating sample and the likeli- 
hood, magnitude and influence of resulting bias. 

A pilot study may establish the likely participation rate 
before all resources are committed to a study that may fail 
(see page 21). It also anticipates causes of non-participation. 

Non-participant study 

When participation rate is low, representativeness 
may sometimes be estimated through a study of non- 
participants. Such studies are never easy to perform 
but are possible, in some settings, when the initial sampling 
was from a register including contact details. Enquiry is 
usually by telephone, calling a random sample of non- 
participants. It is necessarily limited to a few key questions 



[eg, age, gender, the screening question(s) for headache 
[see page 18], perhaps one on headache frequency and a 
very few on diagnosis). This minimal dataset will at least 
show whether non-participants are similar to or very dif- 
ferent from participants in the main study, and therefore is 
highly valuable in assessing various types of selection bias. 
An additional question asking why they did not participate 
initially will also inform discussion of selection bias, and 
may help improve the design of future studies. 

Unfortunately, such studies are impossible among 
people not pre-registered, since they cannot be accessed. 

Method of enquiry 
Principles 

The method of enquiry should: 

• Be suited to the purpose of the study; 

• Encourage participation; 

• Capture the necessary data; 

• Take due account of available resources, the general 
conditions in the area(s) where the study will be 



Stovner et al. The Journal of Headache and Pain 2014, 15:5 
http://www.thejournalofheadacheandpain.eom/content/15/1/5 



Page 16 of 30 



performed and the characteristics of potential 
participants; 

• be tested in the specific setting of the study; 

• be adequately described. 

Commentary 

One or more study instruments guide the process of en- 
quiry so that it captures all of the data needed for the 
study's purpose(s). 

Study instrument 

A structured questionnaire (ie, prescribed questions with 
predefined answer options) is usual, both for headache case 
diagnosis and for enquiry into headache burden. To some 
questions, open answers (eg, number of days with head- 
ache, or names of medicines) are preferable; otherwise, 
open questions may be difficult to interpret and categorize 
and, for diagnosis, do not permit algorithmic determin- 
ation. An alternative approach to diagnosing headache is a 
recognition-based method, presenting descriptions (case 
vignettes) or pictorial representations of different head- 
aches [26]. 

In a structured questionnaire, depending on the study 
purpose(s), any or all of the following elements may be 
needed: 

• Identifier(s); 

• Demographics: age, gender, education, employment, 
income (personal and/or household), habitation 
(urban, rural), ethnicity (when relevant), etc.; 

• Screening and sieve questions; 

• Diagnostic questions; 

• Questions on headache symptom burden: headache 
frequency, duration, intensity, etc.; 

• One or more other elements of burden depending 
on the purpose(s) of the study: eg, disability, time 
loss, family burden, cost, etc. 

Irrelevant questions are irritating to participants, and 
create unnecessary workload. Questionnaires should be 
parsimonious, and not include any questions that do not 
contribute to the study's purpose(s). Further, question- 
naires should avoid enquiries that are individually irrele- 
vant, by directing respondents past sections that are not 
applicable. CATI methods (see page 13) are designed ex- 
pressly to do this, but it is not difficult to construct writ- 
ten questionnaires to do so also. 

If all questions are pertinent to the survey, as they should 
be, each one needs to be answered clearly by each partici- 
pant. However, there are certain questions such as the 
screening question (see page 18), diagnostic questions and 
key questions on burden that cannot be allowed to remain 
unanswered if the respondent is not to be categorized as a 



non-participant (see page 15). These should be highlighted 
to the interviewer. 

Recognition-based diagnosis (presenting descriptions or 
pictorial representations of different headaches) is appropri- 
ate and convenient - and may give the most accurate 
results - in certain settings and populations (espe- 
cially young children). But mixing recognition-based and 
question-based enquiries is unwise, because the former is 
highly leading: if an image depicting unilateral headache is 
recognized, it cannot be followed by the question "Is your 
headache on one or both sides?" [27] . 

Preparing and testing the questionnaire 

The quality of the questionnaire is fundamental to the 
quality of the entire study: nothing can compensate for 
failures in data collection. Time and resources are well 
spent in developing a good questionnaire. 
The following should be considered: 

1) Content 

How many diagnoses are necessary, and what 
aspects of burden are to be measured? 

2) Layout 

Clarity and ease of use reduce error rates. 

3) Intelligibility and acceptability 

The need for these, to encourage participation and 
capture the necessary data, is obvious; but they are 
not readily achieved, and it should not be assumed 
that they have been. A pre-pilot study is recommended 
to show that questions are culturally inoffensive and 
that the length of the questionnaire is acceptable. For 
this limited purpose, a small convenience sample 
can be drawn from patients in clinic, and, if the 
questionnaire has been developed in English, a 
headache-expert interviewer can translate questions as 
they are asked. A larger pilot study (see page 21) is 
needed after translation to test whether the questions 
are understood correctly and discover those that may 
cause problems and require clarification (see page 13). 
This study can also use a convenience sample, but it 
must be drawn from the population of interest. 
Feedback from it may show that the questionnaire 
needs amendment(s) (and re-translation and 
re- testing). 

Translating the questionnaire 

Questionnaires should almost invariably be presented in 
the native language of the intended participants. Many 
of the key concepts used in the diagnosis of headache, 
even the concept of headache itself, have variable mean- 
ings in different languages. Meticulous translation is ne- 
cessary, and requires a rigorous translation protocol with 
backward and forward translations [28] . 
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Validating the questionnaire 

Validation (proof in use) of at least the diagnostic part of 
the questionnaire is highly desirable in the language and 
specific setting of the study. Recommendations with re- 
gard to this are detailed in the section below on case 
definition and diagnosis. 

The HARDSHIP questionnaire: standardized questions for 
prevalence and burden studies 

This modular questionnaire for studies of headache preva- 
lence and burden, published alongside these recommenda- 
tions [29], is an example of a survey instrument including 
the elements described above and following the principles 
set out in the sections below. It has been translated into 
several languages and used in many countries with diverse 
cultures. It includes a diagnostic module based originally 
on ICHD-II (and now on ICHD-3 beta [30], with neces- 
sary modifications (see page 18), and validated against 
headache experts' diagnoses in many different settings. 

Case definition and diagnosis 
Principles 

• Headache caseness should be defined precisely, 
including its timeframe. 

• The screening question should be neutral, unless a 
good reason exists for it to be otherwise, and 
reported verbatim. 

• Diagnoses should be based on current ICHD 
criteria, but modifications are always necessary; 
these should be both explained and justified. 

• Diagnostic instruments should be applied in the 
language of intended participants. 

• Diagnostic instruments require validation in the 
language and specific setting of the study. 

• Diagnoses should be made algorithmically from the 
recorded responses to diagnostic questions, and not 
by the interviewer. 

• The value of diagnosing different headache types 
and subtypes should be critically assessed in the 
light of the study's purpose. 

Commentary 

In studies whose purpose is to assess headache preva- 
lence, or describe its characteristics, case definition or 
caseness (ie, what precisely is meant by "headache") is of 
obvious and fundamental importance. It can greatly in- 
fluence the results. Of equal importance is clarity about 
how headache, and any of its types or subtypes that are 
relevant to the study, are diagnosed (see page 18 and 
page 19). For example, "migraine" in some studies has 
included all of its subtypes (ICHD-II codes 1.1-1.6) but 
in others only migraine with or without aura (ICHD-II 
codes 1.1 and 1.2). "Migraine with aura" itself must be 



clarified because, within ICHD-3 beta, it includes the sub- 
forms migraine with typical aura, migraine with brainstem 
aura and typical aura without headache. The terms "mi- 
graine" and "tension-type headache" have in some studies 
included both episodic and chronic forms of each and in 
others been restricted to the episodic forms, with the 
chronic forms subsumed within the general category of 
headache occurring on >15 days/month. 

Timeframe of headache 

Caseness usually implies the presence of an active head- 
ache disorder, which was defined in ICHD-II as charac- 
terized by any occurrence of headache (either generally 
or of the requisite type) during the last year [11]; hence 
1-year prevalence is most used, and it allows the most 
comparisons with previous studies (but see page 4). 
Shorter timeframes (6-month and 3-month) are also quite 
common; while they yield data on active headache disor- 
ders, some of those characterized by low-frequency recur- 
rence are missed (which may not matter). 

The limitations of recall (correctly remembering what 
happened when) are an important factor in the gener- 
ation of information bias. They are greater in the elderly, 
so introducing differential bias (see page 9). They are 
also, obviously, of greater impact over longer periods. Re- 
call of headache occurrences during the previous 3 months 
is presumably more accurate than recall of those during 
the last year; this is certainly true of recall of the character- 
istics and consequences of each occurrence. Very short and 
recent timeframes such as 1-day prevalence (ascertained 
by questions such as "Do you have headache today?", or 
"Did you have headache yesterday?") avoid recall problems 
almost altogether. An estimate of 1-day prevalence does 
not, alone, describe the proportion of the population with 
an active headache disorder (for discussion of the potential 
terminological confusion relating to this, see page 4) but is 
able to yield very accurate information on diagnosis, func- 
tion and other aspects of burden (see page 24). 

As to headache yesterday, this is most correctly esti- 
mated when respondents are contacted directly by face-to- 
face interview or telephone, and not given the opportunity 
to choose when to answer. If the question is posed by let- 
ter, for example, it is conceivable (indeed not unlikely) that 
persons with headache on the day of receiving it will post- 
pone answering until the first headache-free day, resulting 
in a falsely-high reported prevalence of headache yesterday. 

Lifetime prevalence ("Have you ever had headache?") 
includes both those with active headache and those who 
have previously had headache but no longer do so. As 
already observed, recall of headache long ago is prob- 
lematic; this is true especially in the elderly while, in the 
young, not only may recall be better but also there is no 
"long ago". Lifetime prevalence has mostly been used for, 
and is of most interest in, the rarer headache disorders 



Stovner et al. The Journal of Headache and Pain 2014, 15:5 
http://www.thejournalofheadacheandpain.eom/content/15/1/5 



Page 18 of 30 



such as cluster headache, because it broadens the defin- 
ition of caseness and so increases the likelihood of observ- 
ing cases. Lifetime prevalence is very relevant also in 
genetic epidemiological studies, but for a different reason: 
there is need to eliminate those who have ever had the 
disorder from control groups. 

A single study can ask participants about each of life- 
time, 1-year and 1-day prevalences. 

Case ascertainment: screening and sieve questions 

Most studies use a two-stage procedure: participants are 
first asked whether they have headache or not (screening 
question), and only those answering affirmatively are posed 
the more detailed diagnostic questions. This is time saving, 
and avoids irrelevant enquiry (see page 16), but there is a 
potential penalty: a negative answer to the screening ques- 
tion terminates further enquiry even though it can be false. 
In some studies, screening employs a relatively simple self- 
administered questionnaire, whereas, in the second stage, 
screen-positive participants are subjected to more thor- 
ough face-to-face or telephone interviews. This method is 
less resource-consuming than interviewing all respondents 
personally, while retaining the diagnostic precision of the 
personal interview. However, it is probably more at risk of 
false-negative screening, and sensitivity is expected to be 
lower, particularly for minor headache complaints, than 
when all respondents are interviewed personally. 

The screening question(s) should be constant through- 
out the study, and set out verbatim. It sets boundaries to 
the definition of headache caseness; therefore, the study 
results depend to a large degree on its exact phrasing, 
and the wording of it is determined by the purpose(s) of 
the study. A neutral question is "Have you had headache 
during the last year?"; such a question will include, as 
cases, those for whom headache is only a minor nuis- 
ance, and will therefore yield a higher prevalence than a 
non-neutral question specifying degree, frequency, inten- 
sity or circumstances of its occurrence [12]. Examples of 
the latter are: "Have you suffered from headache?"; "Have 
you had frequent headache?"; "Have you had bad head- 
ache?"; "Have you had headache not due to hangover, 
head injury, flu or common cold 7 ." 

One or more additional questions function as a sieve, 
separating out participants according to thresholds of fre- 
quency {eg, >l/month), intensity or functional impair- 
ment. Through the combination of these questions, a case 
definition can be arrived at that is more interesting from a 
medical or public-health perspective. At the same time, it 
is known how many participants have any headache and 
what proportions are sieved by each sieve question. 

Diagnosing headache types 

In most studies, whether of headache prevalence or burden, 
it is of interest to distinguish between different headache 



types. Diagnostic questions should relate to the criteria of 
the International Classification of Headache Disorders (the 
most recent being the 3rd edition, published in a beta ver- 
sion [ICHD-3 beta] but for immediate use [30,31]) because 
these, by almost universal acceptance, are the common lan- 
guage of definition, and description, of headache disorders. 
Therefore, unless it is the express purpose of the study to 
investigate a new or proposed revision of the ICHD criteria, 
no alternative can be considered. 

But there are problems with this. For epidemiological 
purposes, these criteria must usually be built into a struc- 
tured questionnaire (see pages 16 and 17), although this is 
not how diagnoses are usually made in clinic. ICHD criteria 
were not designed for epidemiological enquiry, and are not 
particularly well-suited to it. Several deliberations are ne- 
cessary when planning to employ them in such studies. 

First, their strict application requires that all partici- 
pants are personally interviewed, and in many cases ex- 
amined, by a competent clinician. In most cases, even if 
possible, this would be a questionable use of resources. 
Second, ICHD criteria are expressed in technical language, 
and must be translated for lay participants without loss or 
distortion of meaning. Third, and partly consequent upon 
this, certain criteria distinguishing between migraine and 
TTH have been noted empirically to pose particular prob- 
lems in epidemiological surveys. It has been found difficult 
to gather correct responses on headache duration [2], re- 
quiring patients to consider untreated attacks, which they 
may never have, or last had long ago. There are no easy 
lay explanations of photo- and phonophobia, which are 
technical concepts. Even more difficult is to specify what 
degrees of photo- and phonophobia fulfil migraine criteria 
in ICHD. Headache specialists are trained to sort these 
things out, but other methods of data collection struggle 
to do so. 

For these three problems there is no perfect solution. 
Pragmatically, there have to be modifications of these 
criteria, and, because these involve departures from ac- 
cepted criteria, they should then be tested and shown to 
be methodologically sound in a validation study that com- 
pares diagnoses obtained by the study instrument with 
those of one or more headache experts ("gold standard": 
see page 20). 

A fourth deliberation concerns headache occurring 
on >15 days/month: it is generally recognized that pre- 
cise diagnosis is difficult at a single encounter with par- 
ticipants (or, indeed, patients in clinic) who have such 
headache. But they, and especially the proportion of them 
with MOH, are of particular public-health interest. 

A fifth deliberation relates specifically to migraine with 
aura, because aura is very difficult to diagnose with any- 
thing close to certainty by questionnaire. As noted earlier, 
migraine with aura includes, within ICHD-3 beta, the sub- 
forms migraine with typical aura, migraine with brainstem 
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aura and typical aura without headache. Unless migraine 
aura is a main object of the study, it is unwise to allow 
the diagnosis of aura to dictate the diagnosis of migraine 
with aura. 

Sixth, and bearing on the seventh, certain questions can- 
not remain unanswered if a diagnosis is to be made. When 
it must be established whether a diagnostic characteristic 
(usually a symptom) is present or absent, the relevant 
question in a structured questionnaire usually offers only 
the response options "yes" or "no", since "don't know" is 
unhelpful. But forcing a response from a participant who 
is uncertain is self-defeating. It is arguable that "don't 
know" implies absence, since presence of a symptom cre- 
ates awareness of it (good examples being photophobia 
and phonophobia), and only absence leads to uncertainty. 
This requires empirical testing. 

The seventh deliberation relates to "probable" diagno- 
ses defined first by ICHD-II [11] (see below). 

"Definite" and "probable" diagnoses 

ICHD-3 beta, as did ICHD-II before it, allows the diagno- 
sis of "probable" primary headache disorders according to 
a general rule: when all but one diagnostic criterion for 
disorder X are met, the diagnosis is "probable X", provided 
that not all criteria are met for another disorder Y (in 
which case the correct diagnosis would be Y). 

The concept of "probable X" has an important purpose in 
clinical management, providing a basis for a treatment plan 
pending later diagnostic confirmation. In epidemiological 
surveys, later confirmation is not expected: initial diagnoses 
of probable X have no opportunity to be amended either to 
X or to another diagnosis. In the specific cases of migraine 
and TTH, diagnosis of the former depends upon the pres- 
ence of specific features {eg, nausea, vomiting, photophobia 
and phonophobia, aggravation by physical activity), while 
diagnosis of TTH depends essentially upon the absence of 
these same features. By ICHD-3 beta rules, the presence of 
all but one feature of migraine is not consistent with a diag- 
nosis of TTH, and must lead to a diagnosis of probable mi- 
graine. The same is true of probable TTH. Yet there are 
unavoidable uncertainties in questionnaire-based diagnoses, 
with the empirical consequence that about half of such 
diagnoses are "probable" according to ICHD rules [14,15] 
while, in validation studies conducted in sub-groups of the 
same populations, fewer than 10% of expert diagnoses are 
probable. 

It is difficult to see that the concept of "probable X" 
serves any purpose in studies of population health. Ra- 
ther, its application detracts from diagnostic accuracy 
[32], and it can be seen why this is so. All questionnaire- 
based diagnoses of primary headaches are "probable", partly 
because of these uncertainties and partly, and in particular, 
because adequate enquiry to exclude secondary headaches 
cannot be undertaken. Any distinction in epidemiological 



studies between "definite X" and "probable X" is arbitrary. 
Almost all cases of "probable migraine" are of episodic 
headache that does not meet criteria for TTH, and are 
therefore more probably migraine than any other type of 
headache. Likewise, almost all cases of "probable TTH", not 
meeting the criteria for migraine, are more probably TTH 
than anything else. Furthermore, however the distinctions 
between "definite" and "probable" are made, "probable mi- 
graine" and "probable TTH" are not separate diagnostic en- 
tities, and the practice of considering them as though they 
were is both illogical and unhelpful; it needs to be strongly 
discouraged. 

As an approach that is both pragmatic and practical in 
epidemiological studies, we recommend that "definite mi- 
graine" and "probable migraine" be reported separately, 
then combined ("all migraine"), and similarly "definite 
TTH" and "probable TTH" ("all TTH"). The important 
proviso is that the diagnostic algorithm (see page 19) ap- 
plies criteria for these diagnoses in the correct order, 
which is dictated by ICHD [30]. 

An important function of validation studies (see page 
20) is to provide proof for this approach. Agreement be- 
tween questionnaire-based and expert ("gold-standard") 
diagnoses is usually low for probable diagnoses (empiric- 
ally, as noted above, questionnaire rates of probable diag- 
noses are much higher than expert rates), but much 
higher for definite and probable combined. 

In the specific case of MOH, this is diagnosable both 
in the main survey and in any validation exercise only as 
an association of medication overuse with frequent head- 
ache. Without evidence of causation, all cases are probable 
MOH. 

How many different diagnoses in the study? 

ICHD-II and ICHD-3 beta describe some 200 different 
headache diagnoses, including subtypes and subforms. 
While the principles presented in these recommendations 
are relevant for all headache diagnoses, in epidemiological 
studies it is neither feasible nor generally desirable to at- 
tempt diagnosis of more than a few of these. From the 
point of view of headache burden, migraine, TTH and 
MOH are of the greatest interest. 

Multiple diagnoses call for more extended enquiry, 
which, probably, leads to lower diagnostic precision [27]. 
Many studies have focused on a single diagnosis (usually 
migraine), meeting their purpose. 

Diagnostic algorithm for headache 

A diagnostic algorithm, applied later to the recorded re- 
sponses, separates the interviewer from the diagnostic pro- 
cess. This is usually necessary because the interviewers are 
not headache experts, and not able themselves to apply 
diagnostic reasoning. When there are multiple interviewers, 
an algorithm ensures uniformity in the process. 
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The algorithmic flow is important. Strictly applied, the 
criteria for ICHD diagnoses may be mutually exclusive, 
but this exclusivity tends to be lost with "probable" diag- 
noses and, especially, with modifications necessarily made 
to these criteria (see page 19). ICHD actually recognizes 
that a single headache type can meet more than one set of 
criteria, and counters this possibility by instructing that 
diagnoses should be made in a hierarchical sequence 
[11,30]. Secondary headaches, including MOH, should be 
diagnosed before primary headaches, and migraine should 
be diagnosed before TTH. A diagnosis of definite migraine 
is given when all criteria are met for migraine and, other- 
wise, the algorithm should accord precedence to TTH 
over probable migraine and, finally, probable TTH. 

A diagnostic algorithm following these principles is at- 
tached to the diagnostic module of the HARDSHIP ques- 
tionnaire (see page 21), which is published alongside these 
recommendations [29]. 

How many diagnoses in each individual? 

Questionnaires can address only one headache disorder 
at a time, while participants in epidemiological surveys 
may have multiple headache types and, accordingly, more 
than one headache diagnosis may be applicable in each in- 
dividual. Since headache diagnosis is based on history and 
recall, there is potential for confusion if a participant has 
more than one headache type in mind, and this needs to 
be eliminated as far as possible. Furthermore, it may be 
difficult for the participant with several headache types to 
remember which features belong to which headache type, 
and attribute them accordingly. For the interviewer, it is a 
great challenge to sort this out correctly. Headache spe- 
cialists are trained to do this in the clinic, but otherwise, 
and certainly in surveys, it is problematic. If only one diag- 
nosis is to be made when more than one headache dis- 
order is present, which should it be? A solution, valid for 
public-health purposes, is to ask the participant to identify 
in his or her mind the most bothersome headache (ie, the 
one that the participant believes interferes most with his 
or her life), and focus solely on this when responding. This 
language will tend to favour migraine, which is likely to be 
the most bothersome headache when multiple types are 
present (see discussion in the next section). 

This approach is adopted in the HARDSHIP question- 
naire (see page 21) [29]. 

Relevance of diagnosing different headache types 

This approach carries a small penalty. Prevalence studies 
that focus on the most bothersome headache will under- 
count headache types that tend to be less bothersome: 
for example, almost all participants who have both migraine 
and TTH will focus on the former, so that TTH is system- 
atically neglected. This should be acknowledged. There is 



an argument for extrapolating the reported prevalence of 
TTH in those without migraine to those with. 

Of course, a questionnaire can be applied a second 
(and third) time to include all headaches that participants 
can separately identify. This has been done [33,34], but 
most participants find it somewhat trying. When it is 
done, it should be reported how the multiple diagnoses in 
single individuals overlap (ie, how many participants have 
migraine only, how many TTH only, how many migraine 
and TTH). 

Making multiple diagnoses markedly increases the num- 
ber of questions that must be asked. This tends to discour- 
age participation, and leads to a lower participation rate. 
All studies should carefully assess the value of making spe- 
cific headache diagnoses. For example, and in particular, 
what is gained in a study to support needs-assessment by 
distinguishing between migraine with and without aura? 
For the purpose of assessing population headache burden, 
to distinguish at all between different headache types may 
be irrelevant. In fact, burden estimation in individuals with 
multiple headache types cannot, even when diagnoses are 
correct, realistically attribute burden in parts to each type. 
For public-health purposes, burden can be attributed to 
"headache" in general [35]. If, instead, it is attributed to 
the most bothersome headache reported by each individ- 
ual, it is unlikely that the resultant over-estimation as far 
as that headache type is concerned will be substantial. 

Validation of the questionnaire 

The key objective of validation is to prove the diagnostic 
capability of the diagnostic questionnaire, and is necessary 
whenever diagnostic accuracy is important and diagnoses 
are not made by headache experts in face-to-face inter- 
views. The importance of validation is increased because 
questionnaire diagnosis requires some modifications of 
ICHD diagnostic criteria (see page 19). 

Diagnostic validation is performed in a separate sample 
selected by identical methodology to the main sample, or 
in a randomly or consecutively selected sub-sample of 
participants in the main study. In the former case, validity 
can be confirmed before resources are committed to the 
main study; otherwise, the latter method conserves re- 
sources by not requiring additional participants. In either, 
stratified sampling according to caseness and diagnosis 
(no headache, migraine, TTH or MOH) may be employed 
to ensure adequate numbers with each diagnosis and 
without headache. Participants in the validation sample or 
sub-sample are re-interviewed and diagnosed by a head- 
ache expert applying his or her clinical skills, while being 
ignorant of the questionnaire diagnoses (a headache ex- 
pert in this context is a doctor who has clinical experience 
of diagnosing headache according to the ICHD criteria, 
and who knows the culture and is fluent in the native lan- 
guage^) of the population of interest). Re-interviewing 
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should be done soon (no more than 1 month) after the 
original diagnosis, so that any discrepancies are unlikely to 
be due to a change in the headache condition itself. 

Questionnaire-derived diagnoses in headache cases are 
compared with those of the expert(s), the latter being 
considered the "gold standard". This allows calculation 
of sensitivity, specificity, positive and negative predictive 
values and kappa value of the diagnostic instrument for 
each diagnosis [36]. The precision (ie, the width of the 
confidence interval) of the estimate for a certain diagno- 
sis is dependent both on the total number of persons in- 
cluded in the validation study and on the number of 
persons with the diagnosis. Validation studies performed 
in the last decade typically include from 180 to 500 per- 
sons drawn randomly from the main study, and have 
given relatively narrow confidence intervals for all the 
most common headache diagnoses [14,15,36-38] except, 
perhaps, MOH [36]. Validation of the screening question 
("Have you had headache ...?") is achieved in those who 
answered "No" to it. 

Validations preferably employ face-to-face interviews 
by the headache expert, but telephone interviews lose lit- 
tle and are an acceptable resource-conserving alternative 
when respondents are widely spread geographically. 

Validation cannot be performed among headache pa- 
tients in clinic, who are not representative of the popula- 
tion of interest in terms of their headache disorders and, 
probably, in a range of other relevant factors. They com- 
monly have more knowledge of headache, and perform 
differently from non-clinic populations when answering 
questionnaires because they have rehearsed their histories. 

As noted previously, validation is not always possible 
[3]. In countries where there are no headache experts, 
there is no gold standard available. What should be avoided 
in such circumstances is the invention of a new and un- 
tested questionnaire; instead, a diagnostic questionnaire 
that has been used and validated in multiple languages and 
cultures can be assumed to have some validity in countries, 
in languages (subject to good translation) and among 
cultures where direct validation cannot be performed. 
The alternative is that research, which is of public-health 
importance, can never be commenced. 

A validation study can inform and optimize the diag- 
nostic algorithm: in other words, small amendments can 
be made to the algorithm and the comparison remade 
between questionnaire-derived and expert diagnoses, which 
may or may not lead to improvement. An example might 
be to exclude from the diagnostic criteria for migraine the 
minimum duration of 4 hours, which has been found em- 
pirically to cause problems [2]. 

What is optimal depends on the purpose of the study. 
There is a trade-off between sensitivity and specificity, in 
the sense that as one increases the other decreases. In 
genetic studies, for example, where diagnostic certainty 



is most important, the algorithm that gives the highest 
specificity may be preferred. For most purposes, the al- 
gorithm that gives the highest overall diagnostic accur- 
acy (sum of sensitivity and specificity) is preferable. 

The HARDSHIP questionnaire 

The diagnostic module of this survey instrument, pub- 
lished as an example alongside these recommendations 
[29], follows the principles set out above. Based origin- 
ally on ICHD-II and now on ICHD-3 beta, with neces- 
sary modifications (see page 19), it has been validated 
against headache experts' diagnoses in multiple and dif- 
ferent settings. 

HARDSHIP first enquires into the presence or absence 
of headache, ever and in the previous year (screening 
questions). When headache is reported, the question- 
naire recognizes the diagnostic difficulty when headache 
occurs on >15 days/month (see page 18), and that making 
more than one diagnosis in each individual is problematic. 
It assumes that, from a public-health point of view, the im- 
portant diagnoses are migraine and TTH, but also ac- 
knowledges that a very large burden is carried by people 
with headache on >15 days/month. Hence, it specifically 
identifies these people, and asks them about possible medi- 
cation overuse (which of itself is of public-health import- 
ance). In all others, headache is assumed to be episodic. 
The next question asks whether participants believe they 
have only one type of headache or more than one; when 
there is more than one, subsequent questions are directed 
to the most bothersome type (see page 20), and it is made 
explicit that, because only one headache type is reported 
by each participant, diagnoses are mutually exclusive. 

Pilot study 
Principles 

A pilot study: 

• Identifies problems before full commitment of 
resources, and therefore: 

• Supports optimum study design, logistic planning 
and effective conduct of the main study; 

• Avoids potentially major wastage of resources; 

• Is ethically desirable; 

• Helps in training of research staff, especially 
interviewers. 

Commentary 

It is both highly sensible and ethically desirable to per- 
form a pilot study in order to detect and forestall prob- 
lems in the main study. Unlike a pre-pilot study (see 
page 21), a pilot study cannot use a patient sample. It 
may, for logistic reasons, employ an element of conveni- 
ence sampling (eg, being restricted to nearby geographical 
areas), but otherwise it uses all of the proposed methods 
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of sampling, access, engagement and data collection. It 
tests practicalities, comprehensibility of the instrument 
and overall feasibility of the study. With regard to the in- 
strument, it should identify which items (questions or re- 
sponses) interviewers may need to clarify, and the extent 
and nature of clarifications to be allowed. With regard to 
feasibility, it should provide estimates of the expected 
non-contact and non-participation rates. 

Burden estimation 

Principles 

Enquiries into burden should be: 

• Relevant; 

• Comprehensive (in relation to the purpose of 
the study); 

• Comprehensible to the respondent; 

• Amenable to analysis. 

Commentary 

Relevance and comprehensiveness 

"Relevance" implies that measured burden must be at- 
tributable to headache, and not to any other cause (in- 
cluding comorbidity). Otherwise the enquiry is spurious 
and findings are invalid. For this reason, inclusion of in- 
struments to register important comorbid conditions is 
warranted (eg, psychiatric comorbidity). 

"Comprehensiveness" requires that all relevant compo- 
nents are measured in order to give a full account. How- 
ever, the purpose of the study can legitimately restrict the 
enquiry to specific elements (for example, financial bur- 
den). When this is so, explanation is necessary. 

All or any of the components of burden described earl- 
ier (see page 3) may be relevant and appropriate objects of 
enquiry in burden-of-headache studies. 

Comprehensibility and analysis 

Most but not all of the elements of burden are amenable 
to enquiry by well-designed questionnaire. The HARDSHIP 
questionnaire [29] (see page 21), for example, incorporates 
enquiry into most of them in a modular design, which 
means the different elements can be included or not. It has 
been tested for comprehensibility and successfully em- 
ployed in different cultures and languages in Russia, China, 
India, Pakistan and ten countries of the European Union 
[14-17,39] and in Saudi Arabia, Zambia and Ethiopia (not 
yet published). In the commentary below, references are 
made to specific questions in HARDSHIP. 

Some elements of burden are not quantifiable, but 
amenable only to qualitative (descriptive) analysis. 

Symptom burden 

See HARDSHIP questions 14, IS, 20, 21/23, 24, 29-32, 
36 and 37. 



Pain burden can be quantified at individual level as a 
product of intensity, frequency and duration, and at popu- 
lation level as the product of the average among individuals 
and prevalence. But it is subjective, quantity estimates are 
inexact and there are no meaningful units in which it may 
be expressed. Other symptoms such as nausea, photopho- 
bia and phonophobia are almost impossible to quantify at 
individual let alone population level, but occurrence of 
each can be recorded and expressed as a frequency. 

Disability burden 

See HARDSHIP questions 38-44 and 58-62. 

There is correlation between symptoms and disability, 
which allows the latter to be measured in place of the 
former. Arguably, disability and lost productivity occur 
in addition to symptoms, so measurement of disability 
instead of symptoms underestimates their impact. Less 
arguable is that disability occurs only as a consequence 
of symptoms, and to measure both is a form of double- 
counting. 

Disability attributed to headache is also difficult to quan- 
tify completely. Common proxies are lost time and reduced 
productivity, for which well-validated instruments exist 
[40,41]. Both can be expressed in meaningful units (days/ 
year), but both are problematic. Lost time (absenteeism 
from paid work, for example) reflects behavioural response 
to impairment rather than disability: except in extreme 
cases, absenteeism is a choice. Reduced productivity is 
rarely exactly measurable, and at best can be approximately 
estimated by the person affected. A better measure con- 
ceptually may be the disability-adjusted life year (DALY), 
or the disability component of it, the YLD (year of life lost 
to disability). Attribution of YLDs to headache depends, es- 
sentially, on prevalence estimation and disability weighting. 
Since disability weights (DWs) are diagnosis-specific, the 
process is underpinned by diagnostic accuracy. Revised 
DWs have been developed as part of the Global Burden of 
Disease Study 2010 [1]. 

Individual financial burden 

To the extent that financial cost arises from lost wages, 
as a consequence of absenteeism, it should be directly 
measurable and easily expressed, and is a relevant and 
important component of the burden of headache for many 
people. However, housewives and unemployed people may 
have no income to lose. In other cases, this cost is only a 
small part of the personal burden because it is largely 
borne by employers or insurers, contributing instead to the 
very high societal economic burden (see page 23). 

A personal cost burden may also arise from health-care 
resource consumption (see page 23). 

Interictal burden 

See HARDSHIP questions 64-66, and 90-101. 
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The importance of interictal burden lies in the fact 
that it is continuous, rather than present only during at- 
tacks occurring perhaps every 30 days. This means both 
that it should not be ignored and that, if over-estimated, 
it will distort overall burden quantification. 

Interictal burden impairs wellbeing and, potentially, qual- 
ity of life. It is perhaps adequately, if not specifically, cap- 
tured by measures of these. 

Cumulative burden 

See HARDSHIP questions 51-57. 

It would appear that components of burden that accrue, 
cumulatively, over a lifetime cannot be fully assessed until 
late in lifetime. Furthermore, attribution may be difficult 
and uncertain. Nevertheless they are of obvious import- 
ance in a full account of burden of headache, and can be 
addressed in questionnaires. 

Overall individual burden 

See HARDSHIP questions 67-74. 

An overall summary measure of burden is unlikely to 
be comprehensive, but the concept is attractive for its 
simplicity. It may also be useful for specific purposes. 

One such measure is willingness-to-pay (WTP). The 
enquiry is along the lines of: "Imagine that there is a 
treatment you can buy. If you take it, your headaches 
will no longer bother you. How much would you be will- 
ing to pay (per week or month) for this treatment?" Re- 
sponses are usually adduced through the "bidding game" 
method. This begins by first asking whether the respond- 
ent would pay a modest amount (set in context) for the 
treatment. If the answer is "yes", the interviewer incre- 
ments the bid until the answer is "no"; the last sum of 
money receiving a "yes" response is the WTP result. If the 
initial answer is "no", the interviewer reduces the bid in 
steps until the respondent says "yes"; the first sum of 
money receiving a "yes" response is then the WTP result. 

This form of enquiry has been used to assess sustain- 
ability of health-care initiatives in resource-poor countries 
[42]. It has wider potential applicability, since it puts a 
value, in monetary terms, to subjectively perceived burden 
in all its elements. Use for this latter purpose requires val- 
idation, especially because willingness-to-pay is clearly not 
independent of ability-to-pay. In addition, there may be a 
disconnection between what people say they will pay and 
what they actually will pay when confronted by the pro- 
spect in reality. 

Burden on others, including carer burden 

See HARDSHIP questions 75-86. 

Again, this can be addressed in questionnaires, although 
subjective interpretations are unavoidable. A full account 
may necessitate enquiries among the others, which in 



practical terms may be possible only among close family 
members. 

Lost work time of carers is potentially important. 

Health-care resource consumption 

See HARDSHIP questions 45-50. 

It is relatively easy to enquire into health-care usage, 
and to establish who pays for it (the patient, insurers or 
society via the State). It is not so easy to attach accurate 
costs to individual items of health care when these costs 
are not borne directly, out of pocket, by the patient. This 
needs to be done, however, and may necessitate separate 
research into health-care costs in the country or region 
in question [6]. 

Societal economic burden 

See HARDSHIP questions 58 and 59. 

By far the greater part of the financial cost of headache is 
the indirect cost of absenteeism and reduced effectiveness 
at work. This cost may be borne by individuals (see above); 
but commonly it falls largely, if not entirely, upon em- 
ployers and/or insurers and is a cost to national economies 
and becomes a part of the societal economic burden. If dis- 
ability measures use absenteeism and reduced effectiveness 
at work as proxies (see above), reasonable estimates of this 
cost can be made with knowledge of average wage-rates 
(age- and gender-corrected) for the population in question. 

More is said about societal economic burden in the 
next section. 

Studies focused on economic burden of headache 
Principles 

Comprehensive enquiries into economic burden: 

• Should include both direct and indirect costs; 

• Must have reliable data sources for attributing costs; 

• Must be based on population studies, not patient 
samples; 

• Should preferably be planned and performed in 
collaboration with a health economist. 

Commentary 

There are references above to financial costs of head- 
ache, which may fall either on individuals or directly on 
society. A full account of financial cost includes both. It 
also includes both direct costs (what the patient or some- 
one else has to pay for) and indirect costs (losses of in- 
come or productivity) (see below). Describing the burden 
of headache on society in purely economic terms has two 
main purposes: 

a) It allows evaluation of headache disorders in 
comparison with other diseases of public-health 
importance laying claim to health-care resources; 
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b) It may show how much can be gained by improving 
health-care services for headache patients. 

Direct costs 

Direct costs include the costs of medicines and other 
treatments, of consultations (including transportation 
costs when these are necessarily incurred), of diagnostic 
investigations and of hospitalizations. They may be cov- 
ered by the patient, by the State, by a commercial health- 
insurance system or in parts by any two or all three. 

Indirect costs 

Indirect costs are dominated by loss of productive work 
time, due either to absence from work or to reduced ef- 
ficiency when working with headache. This cost may be 
borne by individuals (see page 23), but in many societies 
it falls mostly upon employers and/or State or commer- 
cial insurers. 

Other types of indirect cost are hard to capture, and 
therefore rarely described: among these are lost career 
opportunities because headache has interfered with school 
and work (a type of cumulative burden [see page 23]) and 
lost productivity among partners who become carers. 

Measuring economic burden 

See HARDSHIP questions 17, 18, 39-50, 58-62 and 80. 

If appropriate items of economic relevance are in- 
cluded in the questionnaire, it is possible to estimate the 
cost of headache by the "bottom-up" approach: ie, cost 
is calculated for each person with headache in the sample, 
and all costs are summed to give the total cost within the 
sample. This can be extrapolated to the whole population 
of interest. 

The bottom-up method requires a price tag to be placed 
on each item (eg, on each dose of different medicines, on 
each consultation, on each diagnostic investigation, on a 
day and/or night in a hospital bed, on a lost work-day, 
etc.). These prices must be obtained from reliable sources. 
The method is generally viewed as more inclusive and ac- 
curate than the alternative top-down approach, by which 
the overall cost is calculated from publicly available statis- 
tics (eg, total sales of migraine medicines, summary statis- 
tics on health-care utilization and records of sick-leave 
due to headache). 

Ideally, bottom-up cost estimates are made through 
population-based studies in which both prevalence of 
headache and associated costs are determined simultan- 
eously. Costs for this purpose cannot be assessed in pa- 
tients (users of health-care resources) because they are 
not a representative group; they usually have more severe 
or complex headache disorders, so that extrapolation to a 
broader population of interest will give a spuriously high 
total cost [6], and they may be self-selecting (biased) in 
other ways. 



It is recommended to involve a skilled health econo- 
mist in the design, planning and analysis of a study fo- 
cusing on economic burden. 

Timeframe of burden estimation 

As observed earlier, the limitations of recall are an im- 
portant factor in the generation of information bias. 
They are greater in the elderly, so introducing differen- 
tial bias (see page 9). 

Recall of headache occurrences and, more so, recall of 
the characteristics and consequences of each occurrence 
are presumably more accurate over the previous 3 months 
than over the last year. For this reason, burden questions 
have commonly been limited to a 3-month timeframe, as 
in the Migraine Disability Assessment (MIDAS) [40] and 
Headache-Attributed Lost Time (HALT) [41] question- 
naires. The adoption of 3 months is a compromise, con- 
strained on one side by the limits of recall and on the 
other by the purpose of the enquiry. Both instruments 
have in mind the assessment of an individual patient for 
therapeutic reasons and, to establish that person's burden, 
must cover a period long enough to be representative. In 
large-group studies, quite different considerations apply: 
since 3 months may be too long for reasonably accurate 
recall of productivity losses, shorter periods of one month 
or even one week should be considered. 

Very short and recent timeframes, such as headache 
yesterday, avoid recall problems almost altogether. While 
an estimate of 1-day prevalence does not describe the 
proportion of the population with an active headache 
disorder, enquiries focused on headache yesterday can yield 
very accurate information on burden. If the sample is large 
enough (1-day prevalence of an episodic headache disorder 
being obviously much lower than 1-year prevalence), they 
offer a rather precise estimate of the level of headache suf- 
fering in the population on a particular day, and therefore 
on any day. Hence headache yesterday can very accurately 
describe the population headache burden. 

Data collection and storage 

Principles 

Data management and storage processes should: 

• Be designed to minimize error; 

• Be carried out meticulously; 

• Respect the privacy of participants and accord with 
the laws and regulations for data storage in the 
country. 

Commentary 

Huge effort and resource-expenditure go into error-free 
data collection; errors should not be introduced subse- 
quently by less-than-meticulous data management. 
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However collected, data should be recorded in the most 
practical way: usually initially on paper, but sometimes dir- 
ectly into computer. In the former case, data must later be 
transferred from the source documents to computer and, 
in this process, rigorous quality controls are essential. 
Ideally, all data are entered twice, by two independent per- 
sons (double data entry), producing two datasets that are 
compared to detect errors, each of which is resolved by 
reference to the relevant source document. As a resource- 
conserving minimum, a subset of the data (eg, 10% of 
items) is entered twice and, by comparison of the two data 
subsets, an error rate is estimated. Taking due account of 
the uncertainty surrounding this estimate, a decision can 
then be made as to whether this error rate is acceptable 
(ie, will have negligible influence on the results to be re- 
ported); if not, full double data entry is necessary. 

As always, personal data should be handled and stored 
safely, respecting the privacy of participants and in ac- 
cordance with the laws and regulations for data storage 
in the country (see Ethical Issues, page 4). 

Special issues 

Studies of special groups 

Some studies concern not the general population but, 
instead, particular subgroups of it (eg, children and/or 
adolescents, the elderly). These may introduce additional 
issues for consideration. 

Children and adolescents 

In countries with obligatory schooling, representative sam- 
ples of children of school age can be obtained by selecting 
all, or a random sample, of the pupils of a representative 
sample of schools (a cluster-sampling method). These 
schools should reflect the variation of the country or region 
with regard to socioeconomic status and area of habitation 
(rural or urban). Access is greatly facilitated, since the pupils 
can be contacted while at school. Engagement, and inter- 
views, may be undertaken by specially trained interviewers, 
or by a teacher or school nurse. For small children, some 
information will have to be given by parents or teachers. 

For studies on children or youths below the age of major- 
ity, consent to participation must be obtained beforehand 
from parents or guardians. This requirement cannot be 
bypassed; but, if consent is sought by letter from or through 
the school - the usual method - there may be a major dim- 
inution of participation-rate through non-response. The ap- 
parent advantages of sampling via schools, rather than by 
door-to-door visits in the community, may be outweighed 
by this factor. 

Somewhat different diagnostic criteria for migraine apply 
in children (although still based on ICHD). Diagnostic in- 
struments require modification accordingly, and these and 
other measuring instruments must reflect the language 
and comprehension skills of the age group. Validation of 



the diagnostic method is as important as in studies on 
adults, and the headache expert in these studies should 
have experience in diagnosing headache in children. 

The elderly 

Studies in elderly populations (usually defined as >65 years 
of age) should take note of the following, all to be expected: 

• Lower prevalences of primary headache disorders; 

• Higher prevalences of secondary headaches; 

• More comorbidities; 

• Reduced mental capacity (potentially affecting 
engagement and data collection); 

• Reduced physical capacity and less participation in 
the workforce (potentially affecting the way in which 
headache has impact, and, therefore, estimates of 
burden). 

For these reasons, more factors come into play when 
attempting to draw a representative sample in this age 
group, and the sample size may need to be enlarged. The 
enquiry may need to focus more on secondary headaches, 
at least to avoid their misdiagnosis as primary headache 
disorders when the latter are the object of interest. It may 
also need to give greater consideration to comorbidities, 
which may be confounders in estimations of burden. Ad- 
justments for comorbidities will be more important in the 
analysis stage. 

Studies of rare headaches 

The prevalence of rare headaches is difficult to assess 
because a large sample size is needed (see page 11). Also, 
because of limited experience with such headaches, they 
may be difficult to identify through self-administered 
questionnaires; a headache expert must usually make the 
diagnosis. Such a study has been performed for cluster 
headache [43]. 

Rare headaches are unlikely to be of public-health inter- 
est unless severe, which opens up a possible alternative 
method for case-ascertainment: the screening of large 
clinical populations (eg, general practitioners' clinical files) 
for potential cases, and then a more thorough clinical ex- 
pert interview of these for diagnostic confirmation [44]. 
This may work well for severe headache disorders such as 
cluster headache or trigeminal neuralgia that demand 
medical attention, in societies with good access to primary 
care and well-kept electronic (ie, searchable) records. 
Diagnoses (and even symptoms) may not be correctly re- 
corded but, nonetheless, case-ascertainment through care- 
ful searching can be virtually complete. The denominator 
in such cases is the general population. 

Most published prevalence estimates of rare headaches 
are of lifetime prevalence, because cases are more numer- 
ous with this definition of caseness (see page 18); but 
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current prevalence (1-year or 1-day [see page 4]) is likely 
to be of greater interest. 

Stand-alone studies versus large health surveys 

These recommendations call for a relatively long ques- 
tionnaire, suited to stand-alone studies that are dedicated 
wholly to assessing headache prevalence and burden. For 
many scientific purposes it is both feasible and interesting 
to include headache in large health surveys that aim to in- 
vestigate multiple disorders and/or other health-related 
issues. 

Large health surveys assess headache in a wider context 
and may include very large populations because more re- 
sources are put into them. They can include comorbidities 
and associations with potential causes and risk factors, an 
aspect of enquiry that stand-alone studies are rarely capable 
of. On the other hand, large health surveys cannot feasibly 
include a full burden-of-headache questionnaire; instead, 
the necessary minimum is a valid screening question for ac- 
tive headache disorder, some questions allowing determin- 
ation of severity (frequency, intensity and duration), and 
ideally a question set for the diagnosis of migraine and 
TTH (as mutually exclusive diagnoses). Validation of these 
questions remains as important as in stand-alone studies. 

Reporting the study 

Principles 

• Poor reporting diminishes the value of otherwise 
good studies. 

• Reference to the STROBE statement is 
recommended. 

• Methods should be reported in sufficient detail to 
permit replication (which may require that they be 
published separately). 

• Results should be reported in accordance with the 
study's purpose(s). 

• Discussion should show that the study objective was 
achieved, or explain why it was not. 

Commentary 

All too often, the value of otherwise good studies, con- 
suming substantial resources, is diminished because they 
are poorly reported. This is true both for methods and 
results. The STROBE statement gives excellent recom- 
mendations for how epidemiological studies should be 
reported [7,8]. Particularly relevant for headache preva- 
lence and burden studies are the parts on cross-sectional 
studies. 

The introduction 

This should briefly state the scientific question or hy- 
pothesis addressed by the study (objective), why this was 
of interest and the study done, the scientific background 



(what was known beforehand) and any argument(s) that 
justified the choice of study design (eg, cross-sectional, 
case-control, cohort). 

Description of methods 

Methods need to be reported in sufficient detail to per- 
mit replication of the study. Especially for large studies 
likely to result in several papers, it may therefore be ad- 
visable to write a first and separate methodology paper, 
so that subsequent papers can refer to it while concentrat- 
ing on the results. The validation study, and its results, 
should usually be part of this. When a non-participation 
study has been performed, the methodology paper should 
also contain the results of this and discuss their potential 
impact on the main results. 
In all cases, necessary items for inclusion are: 

• Ethics approval; 

• Study design (eg, cross-sectional, case-control, 
cohort); 

• Case definition; 

• How the population of interest was defined; 

• How the study sample was selected; 

• The sampling method, and how participants were 
identified, accessed and engaged; 

• Sample size calculation; 

• The number of interviewers, their background, and 
training; 

• The study instrument (to be adequately described), 
including the exact wording of the screening 
question; 

• Description and results of any validation study; 

• The diagnostic algorithm; 

• In a separate statistical section, a description of 
statistical methods and an account of subgroup 
analyses (each clearly identified as planned before 
the study or conceived after review of the data, as 
applicable). 

Presentation of results 

The participation rate (see page 14) is an important re- 
sult and should always be reported. Further, all reasons 
for non-participation (see Figure 1) should be given, and 
a flowchart of participation should account for everyone 
in the pre-selected sample (Figure 2). 

Principal findings need to be set out in tables and not 
only as graphics (pie or bar charts, etc.), because it is 
often difficult to extract exact values from the latter. 

Prevalence estimates must specify their timeframe (see 
page 17). They should be reported for all headache and/ 
or each of the headache types or subtypes of interest. It 
should be clear whether diagnoses were mutually exclu- 
sive or, alternatively, one person could be diagnosed with 
multiple headache types (or subtypes). In the former 



Table 3 Criteria for evaluating the quality of headache prevalence studies 



Quality factor Score 



Sampled population 

Sampling method 

Number of 
respondents 

Participation rate 

Access 



Validation of 

diagnostic 

instrument 



Diagnostic criteria, 
and application of 
"probable" diagnoses 

Prevalence time frame 



Not stated, or clinic 
population, or members 
of patient organization 

Not stated 



Not stated 

Not stated, or 
Not stated 



Instrument not specified 
or not validated 



Not stated 



Selected population (eg, health-plan 
members, work-place, college 
students) [increase score by 2 if this 
met the specific purpose of the study] 

No (or failed) attempt to secure 
representativeness 

250-1,000 

50-59% 

Self-administered (unsupervised) 
questionnaire 

Validated, but sensitivity and/or 
specificity <60% 



Stated, other than ICHD 



Not specified or not appropriate 
to the study purpose 



1,000-1,500 



60-69% 



Telephone or face-to-face 
interview by untrained or 
unspecified interviewer(s) 

Validated, but sensitivity 
and/or specificity <70% 



ICHD (or reasonable 
modification), but uncertain 
or inappropriate analysis of 
"probable" diagnoses 

Not specified, but 
terminology implies 
"present" or "current" or 
"recent" headache 



General population or community- 
based sample from defined region 
within a country, or school-based 
(for children) 

Random sample uncorrected for 
population demographics 

1 ,500-2,500 

70-79% 

Telephone or face-to-face interview 
by trained lay interviewer(s), medical 
students or nurses 

Validated only in screen-positive 
sub-sample, or in clinic or unspecified 
sample, but sensitivity and 
specificity >70% 



Other specified period appropriate 
to the study purpose 



General population or community- 
based sample from whole country 



Total defined population, or random 
sample corrected for population 
demographics 

>2,500 



>80% 

Face-to-face interview with headache 
expert [reduce score by 1 if interviews 
restricted to screen-positive sub-sample] 

Validated in target population or similar, 
and sensitivity and specificity >70%, or all 
diagnoses made in face-to-face or 
telephone interviews by headache expert 

ICHD (or reasonable modification) with 
clear exposition regarding "probable" 
diagnoses 

Point, 1-day, 1-year or lifetime 



Possible range is -27 to +32. 

Quality thresholds: <9: unacceptable (data of no value); 9-23: acceptable when no other data are available; 24-27: good; >28: very good. 
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case, it should be easy for the reader to add up the types 
or subtypes to 100% of all headache (see page 20). Esti- 
mates should be reported with descriptors of uncertainty 
{eg, 95% confidence intervals), and both unadjusted and 
adjusted for differences between the age- and gender- 
compositions of the sample and those of the population 
of interest, if known. They should further be reported 
for each gender separately and for both combined, and, in 
each of these groups, according to age categories (usually 
each 5 or 10 years). Additional breakdowns (eg, urban vs 
rural habitation, educated vs uneducated, wealthy vs poor) 
may be of interest. Multivariate analyses will show which 
of these have true associations (but not that they are risk 
factors since causation has not been established). 

Missing data should be described and enumerated. 
Always, some participants do not answer all questions 
and/or some answers are unintelligible or otherwise lost. 
The amount of missing data can affect generalizability of 
the results. 

Reports of prevalence alone are of limited interest: 
some measures of burden should be reported in most 
prevalence studies. For whatever are reported, it is often 
of interest to differentiate between genders, age groups, 
urban and rural populations, wealthy and poor, etc. 

The discussion 

This should summarize the key results, focusing first on 
those relevant to the protocol-defined hypotheses and 
objective, and their analysis, then on secondary hypotheses 
generated by the data. Methodological limitations and 
strengths should be evaluated. In particular, it is important 
to discuss the effect the participation rate, and all reasons 
for non-participation, may have had on the representative- 
ness of the study sample, talcing into consideration the re- 
sults of the non-participation study, if one was performed. 
Other potential sources of bias should also be carefully 
evaluated. This should lead to an assessment of the gen- 
eralizability of the results, and the influence of all meth- 
odological limitations on the conclusions to be drawn. 

Finally, the results should be put in the context of what 
was previously known, identifying the new knowledge that 
has been acquired and the value and the implications of it 
for people with headache, health-care providers and/or 
health-care policy-makers. It is sometimes helpful to sug- 
gest new studies that are needed in the light of this new 
knowledge. 

Evaluating the quality of studies 

These principles can be applied not only to the design 
and conduct of new studies but also to the quality evalu- 
ation of published studies. 

A scoring instrument has been developed for this purpose 
(see Table 3). It emphasizes representativeness of the sam- 
ple with regard to the population of interest, the sampling 



method, sample size, participation rate, access, validation, 
use of diagnostic criteria and description of timeframe. Each 
quality factor has a maximal score of 4, and the maximal 
total score is 32. For factors crucial to study quality, a heavy 
penalty of -4 is applied when no information at all is pro- 
vided: serious flaws in any of these can irredeemably invali- 
date the results, even when all other aspects of the study 
have been adequately performed and reported, and the 
quality assessment must reflect this. 

Conclusions 

Population-based burden-of-headache studies contribute 
essentially to our understanding of disease origins, patterns, 
aetiology and risk factors, inform needs-assessment and un- 
derpin service policy; they must be of high quality. Many 
factors that influence quality have been noted to be inad- 
equately addressed in past studies [3] . These, and others of 
potential importance, are made the subject of recommen- 
dations accompanied by detailed explanatory commentary. 
The expertise contributing to this document was extensive, 
both theoretical and practical, and drawn from all regions 
of the world; while it is not a textbook on epidemiology 
(general principles are dealt with elsewhere), the recom- 
mendations should provide helpful and authoritative guid- 
ance for most purposes related to headache. 
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